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Preface 



This volume contains the 39 contributed papers and two invited papers presented 
at the 8th Annual European Symposium on Algorithms, held in Saarbriicken, 
Germany, 5-8 September 2000. ESA 2000 continued the sequence: 

— 1993 Bad Honnef (Germany) 

~ 1994 Utrecht (The Netherlands) 

— 1995 Gorfu (Greece) 

~ 1996 Barcelona (Spain) 

— 1997 Graz (Austria) 

~ 1998 Venice (Italy) and 

— 1999 Prague (Gzech Republic). 



The proceedings of these previous meetings were published as Springer LNGS 
volumes 726, 855, 979, 1136, 1284, 1461, and 1643. 

Papers were solicited in all areas of algorithmic research, including approxi- 
mation algorithms, combinatorial optimization, computational biology, compu- 
tational geometry, databases and information retrieval, external-memory algo- 
rithms, graph and network algorithms, machine learning, on-line algorithms, 
parallel and distributed computing, pattern matching and data compression, 
randomized algorithms, and symbolic computation. Algorithms could be sequen- 
tial, distributed, or parallel, and should be analyzed either mathematically or 
by rigorous computational experiments. Experimental and applied research were 
especially encouraged. 

Each extended abstract submitted was read by at least three referees, and 
evaluated on its quality, originality and relevance to the Symposium. The entire 
Programme Gommittee met at the University of Warwick on 5-6 May 2000 and 
selected 39 papers for presentation. These, together with the two invited papers 
by Monika Henzinger and Thomas Lengauer, are included in this volume. 

The Programme Gommittee was: 



Lars Arge 
Yossi Azar 
Leslie Goldberg 
Mike Jiinger 
Danny Krizanc 
Alessandro Panconesi 



(Duke) 
(Tel Aviv) 
(Warwick) 
(Koln) 
(Wesleyan) 
(Bologna) 



Mike Paterson 
Marco Pellegrini 
R. Ravi 

Jan van Leeuwen 
Emo Welzl 



(Ghair; Warwick) 
(GNR, Pisa) 
(Garnegie-Mellon) 
(Utrecht) 
(ETH, Ziirich) 



As an experiment, ESA 2000 was held as a combined conference (GONE 2000) 
together with the Workshops on Algorithmic Engineering (WAE 2000) and Ap- 
proximation Algorithms for Gombinatorial Optimization Problems (APPROX 
2000), in Saarbriicken. The Organizing Gommittee consisted of Uwe Brahm, 
Jop Sibeyn (Ghair), Ghristoph Storb, and Roxane Wetzel. 
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Preface 



ESA 2000 was sponsored by the European Association for Theoretical Com- 
puter Science (EATCS), and was supported by the Max-Planck-Institut fiir In- 
formatik. 

We hope that this volume offers the reader a representative selection of some 
of the best current research on algorithms. 
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Web Information Retrieval - an Algorithmic 

Perspective 



Monika Henzinger^ 



Google, Inc., Mountain View, CA 94043 USA, 
monikaOgoogle . com, 

WWW home page: http://www.henzinger.com/monika/index.html 



Abstract. In this paper we survey algorithmic aspects of Web informa- 
tion retrieval. As an example, we discuss ranking of search engine results 
using connectivity analysis. 



1 Introduction 

In December 1999 the World Wide Web was estimated to consist of at least one 
billion pages, up from at least 800 million in February 1999 [28]. Not surprisingly 
finding information in this large set of pages is difficult and many Web users turn 
to Web search engines for help. An estimated 150 million queries are currently 
asked to Web search engines per day - with mixed success. 

In this paper we discuss algorithmic aspects of Web information retrieval 
and present an algorithm due to Brin and Page [6] that is very successful in 
distinguishing high-quality from low-quality Web pages, thereby improving the 
quality of query results significantly. It is currently used by the search engine 
Google^. 

This paper is loosely based on part of a tutorial talk that we presented with 
Andrei Broder at the 39th Annual Symposium on Foundations of Computer 
Science (FOGS 98) in Palo Alto, California. See [8] for write-up of the remaining 
parts of this tutorial. 

We will use the terms page and document interchangeably. 



2 Algorithmic Aspects of Web Information Retrieval 

The goal of general purpose search engines is to index a sizeable portion of the 
Web, independently of topic and domain. Each such engine consists of three 
major components: 

~ A crawler (also called spider or robot) collects documents by recursively fetch- 
ing links from a set of starting pages. Each crawler has different policies with 
respect to which links are followed, how deeply various sites are explored, etc. 
Thus, the pages indexed by various search engines differ considerably [27,2]. 

^ http://www.google.com/ 

M. Paterson (Ed.): ESA 2000, LNCS 1879, pp. 1-8, 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
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— The indexer processes the pages collected by the crawler. First it decides 
which of them to index. For example, it might discard duplicate documents. 
Then it builds various data structures representing the pages. Most search 
engines build some variant of an inverted index data structure (see below). 
However, the details of the representation differ among the major search 
engines. For example, they have different policies with respect to which words 
are indexed, capitalization stemming, whether locations within documents 
are stored, etc. The indexer might also build additional data structures, like 
a repository to store the original pages, a Web graph representation to store 
the hyperlinks, a related-pages finder to store related pages, etc. As a result, 
the query capabilities and features of the result pages of various engines vary 
considerably. 

— The query processor processes user queries and returns matching answers, 
in an order determined by a ranking algorithm. It transforms the input into 
a standard format (e.g. to lower-case terms), uses the index to find the 
matching documents, and orders (ranks) them. 

Algorithmic issues arise in each part. We discuss some of them in the follow- 
ing. 



2.1 Crawler 

The crawler needs to decide which pages to crawl. One possible implementation 
is to assign to each page a priority score indicating its crawling importance, and 
to maintain all pages in a priority queue ordered by priority score. This priority 
score can be the PageRank score (defined in the next section) of the page [12] if 
the goal is to maximize the quality of the pages in the index. Alternatively, the 
score can be a criterion whose goal is to maximize the freshness of the pages in 
the index [14,11]. 

The crawler also has to consider various load-balancing issues. It should not 
overload any of the servers that it crawls and it is also limited by its own band- 
width and internal processing capabilities. An interesting research topic is to 
design a crawling strategy that maximizes both quality and freshness and re- 
spects the above load-balancing issues. 

2.2 Indexer 

The indexer builds all the data structures needed at query time. These include 
the inverted index, a URL-database, and potentially a graph representation, a 
document repository, a related-pages finder, and further data structures. 

The inverted index contains for each word a list of all documents containing 
the word, potentially together with the position of the word in the document. 
This list is sorted lexicographically according to the (document-id, position in 
the document)-pair. (See [1] for a detailed description of this data structure.) 

To save space, documents are represented by document-ids in the index and 
the other data structures. When results are displayed, these document-ids need 
to be converted back to the URL. This is done using the URL-database. 
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The graph representation keeps for each document all the documents pointing 
to it and all the documents it points to. (See [4] for a potential implementation.) 

The document repository stores for each document-id the original document. 

The related-pages finder stores for each document-id all document-ids of 
pages that are related to the document. A related page is a page that addresses 
the same topic as the original page, but is not necessarily semantically identical. 
For example, given www.nytimes . com other newspapers and news organizations 
on the Web would be related pages. See [15] for algorithms that find related 
pages. 

Of course all or some of the latter four data structures can be combined. 

Before building the data structures the indexer needs to determine which 
pages to index. For this, it can assign a numerical score to each page and then 
index a certain number of top-ranked pages. For example this score can be 0 
for all but one of a set of duplicate pages. The score might also try to measure 
the query-independent quality of a page. This can for example be done by the 
PageRank measure (see below). 

An interesting algorithmic question for each of these data structures is how 
to compress the space as much as possible without affecting the average look- 
up time. Furthermore, the data structure should remain space-efficient when 
insertions and deletions of documents are allowed. 



2.3 Query Processor 

The main challenge of the query processor is to rank the documents matching 
the query by decreasing value for the user. For this purpose again a numerical 
score is assigned to each document and the documents are output in decreasing 
order of the score. 

This score is usually a combination of query-independent and query-de- 
pendent criteria. A query-independent criterion judges the document regardless 
of the actual query. Typical examples are the length, the vocabulary, publica- 
tion data (like the site to which it belongs, the date of the last change, etc.), 
and various connectivity-based techniques like the number of hyperlinks point- 
ing to a page or the PageRank score. A query- dependent criterion is a score 
which is determined only with respect to a particular query. Typical examples 
are the cosine measure of similarity used in the vector space model [34] , query- 
dependent connectivity-based techniques [25], and statistics on which answers 
previous users selected for the same query. 

The algorithmically most interesting of these techniques are query-indepen- 
dent and query-dependent connectivity-based techniques. We will describe the 
query-independent approach below. The assumption behind the connectivity- 
based ranking techniques is that a link from page A to page B means that the 
author of page A recommends page B . Of course this assumption does not always 
hold. For example, a hyperlink cannot be considered a recommendation if page 
A and B have the same author or if the hyperlink was generated for example by 
a Web-authoring tool. 
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The idea of studying “referrals” is not new. There is a subfield of classical in- 
formation retrieval, called bibliometrics, where citations were analyzed. See, e.g., 
[23,17,37,18]. The field of sociometry developed algorithms [24,30] very similar 
to the connectivity-based ranking techniques described in [6,25]. Furthermore, 
a large amount of Web-related research exploits the hyperlink structure of the 
Web. 

A Graph Representation for the Web The Web can be represented as a 
graph in many different ways. Connectivity-based ranking techniques usually 
assume the most straightforward representation: The graph contains a node for 
each page u and there exists a directed edge (u, v) if and only if page u contains 
a hyperlink to page v. 



Query-Independent Connectivity-Based Ranking The assumption of con- 
nectivity based techniques immediately leads to a simple query-independent cri- 
terion: The larger the number of hyperlinks pointing to a page the better the 
page [9]. The main drawback of this approach is that each link is equally im- 
portant. It cannot distinguish between the quality of a page pointed to by a 
number of low-quality pages and the quality of a page that gets pointed to by 
the same number of high-quality pages. Obviously it is therefore easy to make a 
page appear to be high-quality - just create many other pages that point to it. 

To remedy this problem, Brin and Page [6,31] invented the PageRank mea- 
sure. The PageRank of a page is computed by weighting each hyperlink propor- 
tionally to the quality of the page containing the hyperlink. To determine the 
quality of a referring page, they use its PageRank recursively. This leads to the 
following definition of the PageRank R{p) of a page p: 

R{p) = tin -k (1 - e) • E i?(g) / outdegree{q ) , 

{q,p) exists 



where 

— e is a dampening factor usually set between 0.1 and 0.2; 

— n is the number of pages on the Web; and 

— outdegree(q) is the number of hyperlinks on page q. 

Alternatively, the PageRank can be defined to be the stationary distribution 
of the following infinite random walk p\,p 2 ,P 3 , ■ ■ ■ , where each pi is a node in 
the graph: Each node is equally likely to be node pi. To determine node Pi+i 
a biased coin is flipped: With probability e node pi+i is chosen uniformly at 
random from all nodes in the graph, with probability 1 — e it is chosen uniformly 
at random from all nodes q such that edge (pi, q) exists in the graph. 

The PageRank measure works very well in distinguishing high-quality Web 
pages from low-quality Web pages and is used in the Google search engine^. 

http : / /www . google . com/ 
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Recent work refined the PageRank criterion and sped up its computation, see 
e.g. [33,19]. In [20,21] PageRank-like random walks were performed to sample 
Web page almost according to the PageRank distribution and the uniformly 
distribution, respectively. The goal was to compute various statistics on the 
Web pages and to compare the quality, respectively the number, of the pages in 
the indices of various commercial search engines. 

2.4 Other Algorithmic Work Related to Web Information Retrieval 

There are many other algorithmic challenges in Web information retrieval. We 
list several below, but this list is by no means complete. 

Near- duplicates: To save space in the index search engines try to determine 
near-duplicate Web pages and completely or partially near-duplicate Web hosts 
(called mirrors). See [5,7,35] for algorithms for the near-duplicates problem and 
see [3,13] for algorithms for the mirrored host problem. 

Clustering: Clustering query results has lead to an interesting application of 
suffix trees [38] . 

Web page categorization: There are various hierarchical directories of the 
Web, for example the open directory hierarchy^. They are usually constructed 
by hand. Automatically categorizing a web, i.e. placing it at a node(s) in the 
hierarchy to which it belongs, is a challenging problem. There has been a lot of 
work on text-only methods. See [10] for a first step towards a text and link-based 
approach. 

Dynamically generated Web content: Trying to “learn” to crawl dynamically 
generated Web pages is an interesting topic of future research. A first step in 
this direction is taken by [16]. 

Web graph characterization: Characterizing the Web graph has led to a se- 
quence of studies that are algorithmically challenging because of the pure mag- 
nitude of the data to be analyzed. See [26] for a survey. 

Web user behavior: Mining user query logs shows that web users exhibit a 
different query behavior than users of classical information retrieval systems. An 
analysis of different query logs is given in [22,36]. 

Modeling: Different users asking the same query can widely disagree on the 
relevance of the answers. Thus, it is not possible to prove that certain ranking 
algorithms return relevant answers at the top. However, there has been some 
recent work on trying to find appropriate models for clustering problems in 
information retrieval and to use them to explain why certain algorithms work 
well in practise. See, e.g. [32]. This is certainly an interesting area of future 
research. 

3 Conclusions 

There are many interesting algorithmic questions related to Web information 
retrieval. One challenge is to find algorithms with good performance. The per- 

® http : / /www . dmoz . org/ 
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formance of these algorithms is usually validated by experimentation. The other 
challenge is to theoretically model information retrieval problems in order to 
explain why certain algorithms perform well. 
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1 Introduction 



Computational biology is an area in applied computer science that has gained 
much attention recently. The reason is that new experimental methods in molec- 
ular biology and biochemistry have afforded entirely novel ways of inspecting the 
molecular basis of life’s processes. Experimental breakthroughs have occurred in 
quick succession, with the first completely sequenced bacterial genome being 
published in 1995 (genome length 1.83 Mio bp, 1700 genes) [8], the first eukary- 
ote yeast following in 1996 (genome length 13 Mio bp, 6275 genes) [9], the first 
multicellular organism C. elegans being sequenced in late 1998 (97 Mio bp, 19000 
genes) [5], the fruitfiy coming along this February (120 Mio bp, 13600 genes) [1] 
and man being pre-announced in April 2000. Several dozen completely sequenced 
microbial genomes are available today. This puts biology on a completely new 
footing since, for the first time, it can be ascertained not only which components 
are necessary to administer life but also which ones suffice. 

But the experimental scene has moved beyond sequencing whole genomes. 
Today, it is possible to measure the protein populations inside a cell and to do so 
specific to certain cell types (tissues) and cell states (normal, stressed, diseased 
etc.) In so-called differential expression displays, the relative count of different 
proteins is monitored comparing two different cell states [6]. The experimental 
data are still quite noisy but, for the first time in history, it is now possible 
to query a cell as to the role of the proteins that it manufactures. This has 
interested many researchers and pharmaceutical companies in exploiting these 
data to search for fundamentally new approaches to the diagnosis and therapy of 
diseases. It is this not only scientifically but also economically quite important 
background that motivates us to discuss computational biology within a phar- 
maceutical context for the purpose of this article. Rather than giving a detailed 
description of a particular problem in computational biology, we aim at putting 
into context the set of problems to be solved and the solution methods applied. 



M. Paterson (Ed.): ESA 2000, LNCS 1879, pp. 9-19, 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 
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2 Two Stages in Drug Development 

The development of a new drug is performed in two basic steps. The first is the 
identification of a key molecule, usually a protein, the so-called target protein, 
whose biochemical function is causative of the disease. The second step is the 
search for or development of a drug that moderates - often blocks ~ the function 
of the target protein. 




Fig. 1. 3D structure of the protein dihydrofolate-reductase (DHFR) 



Figure 1 shows the three-dimensional shape of the protein dihydro folate- 
reductase (DHFR) which catalyzes a reaction that is important in the cell divi- 
sion cycle. DHFR has a prominent binding pocket in its center that is specifically 
designed to bind to the substrate molecule dihydrofolate and induce a small mod- 
ification of this molecule. This activity of DHFR can be blocked by administering 
the drug molecule methotrexate (MTX) (Figure 2). MTX binds tightly to DHFR 
and prevents the protein from exercising its catalytic function. MTX is a com- 
monly administered drug in cancer treatment, where our goal is to break the 
(uncontrolled) cell division cycle. 

This example shows both the benefits and the problems of current drug 
design. Using MTX, we can in fact break the cell division cycle. However, DHFR 
is actually the wrong target molecule. It is expressed in all dividing cells, thus a 
treatment with MTX not only affects the tumor but all dividing cells in the body. 
This leads to severe side effects such as losing one’s hair and intestinal lining. 
What we need is a more appropriate target protein - one that is specifically 
expressed inside the tumor and whose inhibition does not cause side effects in 
other tissues. 
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Fig. 2. The inhibitor methotrexate (MTX) bound to the binding pocket of 
DHFR. 



3 Search for Target Proteins 

Until a few years ago target proteins could not be systematically searched for. 
Now new experimental methods in molecular biology have rendered a cell-wide 
search for target proteins feasible. Today, we can search for a suitable target 
protein among a few ten thousand protein candidates. The selection of promising 
candidates from such a large set of molecules can only be done with the help of 
the computer. 

The basis for such screens are so-called microarrays or DNA-chips (Figure 
3). The actual image which can be downloaded from the website mentioned in 
the figure caption is an additive overlay of two pictures in the colors green (cell 
state 1: yeast in the presence of glucose) and red (cell state 2: yeast deprived 
of glucose), respectively. Bright dots correspond to proteins that are expressed 
highly (in large numbers) in the respective cell state. 

In principle, microarrays afford a comprehensive picture of the protein state 
of the cell. (Actually what is measured here are mRNA levels, not the proteins. 
We comment further on this biological detail in the Summary of this article.) 
The problem is that we learn very little about each protein: its expression level 
and its amino-acid sequence (or even only part of the sequence) are the only 
data we are given. In order to select a suitable target protein, we need to know 
much more. Where does the protein occur in the cell (cellular localization)? It 
would help to know the 3D structure. What is the protein function? What are 
its binding partners? What biochemical pathway is it involved in? These are 
all very difficult questions whose exact answer would entail extensive laboratory 
procedures over a long time period with a lot of human resources - something we 
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Fig. 3. A DNA chip containing all yeast genes (taken from 
http://cmgm.stanford.edu/pbrown/explore/index.html). 



can afford to do with one or a few proteins but not with hundreds and thousands 
of interesting protein candidates coming from a DNA-chip experiment. 

Therefore, these questions have to be answered by computer. Usually, a first 
step is to analyze the protein sequences in order to elucidate evolutionary re- 
lationships. This is done by a version of sequence alignment. If a high level 
similarity of the protein sequence under study to some sequence in a database is 
established then we can hope to infer the function of the sequence under study 
from biological knowledge of the sequence hit in the database. The process of 
inferring this kind of biological information is not automated, however. 

Often, proteins are assembled from building blocks called domains. Thus, 
identifying the domains of a protein is a first problem to be solved in sequence 
analysis [2]. 

As the level of sequence similarity decreases, inferring function from sequence 
alone become increasingly hazardous. Thus, it helps to predict the 3D structure 
of the protein. At GMD, we have developed different methods for doing this. 
One program (123D) is based on an efficient dynamic programming scheme and 
can make a protein structure prediction within minutes [15]. Another program 
(RDP) is based on a more detailed NP-hard formulation of the protein folding 
problem and runs within hours [24] . From the protein structure, we can attempt 
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to reason about the function, again in a process that has not been automated, 
so far. 

Besides the protein structure, other sources of information can help to infer 
function such as the use of phylogenetic analyses [14] and the analysis of which 
protein domains are fused or separated in different organisms [12]. The localiza- 
tion of the protein inside the cell can be predicted by a special kind of sequence 
analysis [13]. 

Finally, we have developed a method to expand hypotheses on the local- 
ization of proteins in biochemical pathways using differential expression data. 
This method involves a statistical correlation analysis between the differential 
expression levels of the protein under study and those of the proteins that we 
have already assigned to the biochemical pathway [26]. 

4 Demands on Computational Biology 

Of course, nature is much too complex to be modeled to any sufficiently accurate 
detail. And we have little time to spend on each molecular candidate. Thus we 
mostly do not even attempt to model things in great physical detail, but we 
use techniques from statistics and machine learning to infer “signals” in the 
data and separate them from “noise” . Just as people interpret facial expressions 
of their dialog partners not by a thorough physiological analysis that reasons 
backwards from the shape of the muscles to the neurological state of the brain 
but learn on (usually vast amounts of) data how to tell whether somebody is 
happy or sad, attentive or bored, so do computational biology models query 
hopefully large sets of data to infer the signals. Here signal is a very general 
notion that can mean just about anything of biological interest, from a sequence 
alignment exhibiting the evolutionary relationship of the two proteins involved 
over a predicted 2D or 3D structure of a protein to the structure of a complex 
of two molecules binding to each other. On the sequence level, the splice sites 
in complex eukaryotic genes, the location and makeup of promoter sequences or 
the design of signal peptides giving away the final location of the protein in the 
cells are examples of interesting signals. 

Statistical methods that are used to learn from biological data have classically 
included neural nets and genetic algorithms. Hidden-Markov models [11,7] are 
a very popular method of generating models for biological signals of all kinds. 
Recently support vector machines have been applied very successfully to solving 
classification problems in computational biology [10,25]. 

As the methods of analysis are inexact so are the results. The analyses yield 
predictions that cannot be trusted, in general. This is quite different from the 
usual situation in theoretical computer science, where you are either required to 
compute the optimum solution or, at least, optimum means something and so 
does the distance of the computed solution to the optimum, in case that you 
do not hit the optimum. Not so here. Cost functions in computational biology 
usually miss the goal. Notions such as evolutionary distance or free energy are 
much too complex to be reflected adequately by easy-to-compute cost functions. 
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Thus, computational biology is dominated by the search for suitable cost func- 
tions. Those cost function can be trained, just as the models in toto. We have 
used a training procedure based on linear programming to improve the predic- 
tive power of our protein structure prediction methods [27]. Another possibility 
is to leave the mystical parameters in the cost function variable and study the 
effect of changing them on the outcome. A method for doing this in the area of 
sequence alignment is presented in [28]. 

Whether a method or a cost function is good or bad cannot be proved but 
has to be validated against biologically interpreted data that are taken as a gold 
standard for purposes of the validation. Several respective data sets have evolved 
in different bioinformatics domains. Examples are the SCOP (http://scop.mrc- 
lmb.cam.ac.uk/scop/) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/) 
structural protein classifications for validating methods for protein structure 
prediction and analysis. These sets are not only taken to validate the different 
methods but also to compare them community- wide. 

For more details on algorithmic aspects of computational biology, see also 

[ 20 ]. 

Validating methods and cost functions on known biological data has a serious 
drawback. One is not prepared to answer the question whether one has used the 
biological knowledge in the method, either on purpose or inadvertently. There- 
fore, the ultimate test of any computational biology methods is a “blind pre- 
diction” , one that convincingly makes a prediction without previous knowledge 
of the outcome. To stage a blind prediction experiment involves a certification 
authority that vouches for the fact that the knowledge to be predicted was not 
known to the predictor. The biannual GASP (Critical Assessment of Structure 
Prediction Methods [3,4] experiment series that was started in 1994, performs 
this task for protein structure prediction methods. The GASP team provides 
a world-wide clearing house for protein sequences whose structures are in the 
process of being resolved, e.g. by crystallographers. The group that resolves the 
structure communicates the protein sequence to the GASP team that puts it 
on the web up for prediction. Sufficiently before the crystallographers resolve 
the structure, the prediction contest closes on that sequence. After the structure 
is resolved it is compared with the predictions. GASP has been a tremendous 
help in gaining acknowledgement for the scientific discipline of protein structure 
prediction. 

5 Searching For New Drugs 

The search for drugs that bind to a given target protein also has been system- 
atized greatly with the advent of very efficient methods for synthesizing new 
compounds (combinatorial chemistry) and testing their binding properties to 
the protein target (high-throughput screening). Gombinatorial libraries provide 
a carefully selected set of molecular building blocks - usually dozens or hundreds 
- together with a small set of chemical reactions that link the modules. In this 
way, a combinatorial library can theoretically provide a diversity of up to billions 
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of molecules from a small set of reactants. Up to millions of these molecules can 
be synthesized daily in a robotized process and submitted to chemical test in a 
high-throughput screening procedure. In our context, the objective of the test is 
to find out which compounds bind tightly to a given target protein. 

Here we have a similar situation as in the search for target proteins. We have 
to inspect compounds among a very large set of molecular candidates, in order to 
select those that we want to inspect further. Again, computer help is necessary 
for preselection of molecular candidates and interpretation of experimental data. 

In the computer, finding out whether a drug binds tightly to the target 
protein can best be done on the basis of knowledge of the protein structure. 
If the spatial shape of the site of the protein to which the drug is supposed 
to bind is known, then we can apply docking methods to select suitable lead 
compounds which have the potential of being refined to drugs. The speed of a 
docking method determines whether the method can be employed for screen- 
ing compound databases in the search for drug leads. At GMD, we developed 
the docking method FlexX that takes a minute per instance and can be used 
to screen up to thousands of compounds on a PC or hundreds of thousands 
of drugs on a suitable parallel computer. Docking methods that take the bet- 
ter part of an hour cannot suitably be employed for such large scale screening 
purposes. In order to screen really large drug databases with several hundred 
thousand compounds or more we need docking methods that can handle single 
protein/drug pairs within seconds. The high conformational flexibility of small 
molecules as well as the subtle structural changes in the protein binding pocket 
upon docking (induced fit) are major complications in docking. Furthermore, 
docking necessitates careful analysis of the binding energy. The energy model 
is cast into the so-called scoring function that rates the protein-ligand complex 
energetically. Challenges in the energy model include the handling of entropic 
contributions, solvation effects, and the computation of long-range forces in fast 
docking methods. 

Here is a summary of the state of the art in docking. With our program FlexX, 
handling the structural flexibility of the drug molecule can be done within the 
regime up to about a minute per molecular complex on a PC [22] . (FlexX has also 
been adapted to screening combinatorial drug libraries [23].) A suitable analysis 
of the structural changes in the protein still necessitates more computing time. 
Today, tools that are able to dock a molecule to a protein within seconds are 
still based on rigid-body docking (both the protein and ligand conformational 
flexibility is omitted) [18]. 

Even if the structure of the protein binding site is not known, we can use 
computer-based methods to select promising lead compounds. Our program 
FlexS compares the structure of a molecule with that of a ligand that is known 
to bind to the protein, for instance, its natural substrate [19]. The runtime is 
again between one and two minutes on a PC. 

Finally, the program Ftrees [21] can speed up this comparison to 50 ms per 
molecular pair, on the average. Molecular structure is still analyzed, on some 
level, which is an exception in this time-regime. Drug molecules are abstracted 
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to tree-structures, for this purpose. Heuristic tree matching algorithms provide 
the basis for the molecular comparison. 

The accuracy of docking predictions lies within 50% to 80% “correct” pre- 
dictions depending on the evaluation measure and the method [17]. That means, 
that docking methods are far from perfectly accurate. Nevertheless, they are 
very useful in pharmaceutical practice. The major benefit of docking is that a 
large drug library can be ranked with respect to the potential that its molecules 
have for being a useful lead compound for the target protein in question. The 
quality of a method in this context can be measured by an enrichment factor. 
Roughly, this is the ratio between the number of active compounds (drugs that 
bind tightly to the protein) in a top fraction (say the top 1%) of the ranked drug 
database divided by the same figure in the randomly arranged drug database. 
State-of-the-art docking methods in the middle regime (minutes per molecular 
pair), e.g. FlexX, achieve enrichment factors of up to about 15. Faster meth- 
ods, e.g. FeatureTrees, achieve similar enrichment factors, but deliver molecules 
more similar to known binding ligands and do not detect such a diverse range 
of binding molecules. 

Besides the savings in time and resources, another advantage of virtual 
screening of drug libraries over high-throughput screening in the laboratory is 
that the computer models produced can be inspected in order to find out why 
certain drugs bind tightly to the protein and others do not. This insight can be 
a fruitful source of further pharmaceutical innovation. 

6 Summary 

Computational biology is an extremely interesting and demanding branch of ap- 
plied computer science. This article could only touch upon a few research topics 
in this complex field. Computational biology is a very young field. The biological 
systems under study are not very well understood yet. Models are rough, data 
are voluminous but often noisy. This limits the accuracy of computational bi- 
ology predictions. However, the analyses improve quickly, due to improvements 
on the algorithmic and statistical side and to the accessibility to more and bet- 
ter data. Nevertheless, computational biology can be expected to be a major 
challenge for some time to come. 

New kinds of experimental data are already in sight. These include protein 
data from mass spectrometry experiments. With these data, we can look at pro- 
teins not only on the transcription (mRNA) level but also, as they have been 
modified after their synthesis. These protein modifications, which include gly- 
cosylation (attachment of large tree-like sugar molecules to the protein surface) 
and phosphorylation (attachment of phosphate groups that often activate pro- 
teins), greatly influence protein function. Computational biology has not really 
advanced to analyzing these post-translational modifications of proteins yet. 

Another emerging area is that of studying the genomic differences between in- 
dividuals in a species, e.g., between different people. Of course, the susceptibility 
to disease and the receptivity to drug treatment finally is based on these dif- 
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ferences. Genomic differences between people amount to single nucleotide poly- 
morphisms (SNPs), i.e., changes in single bases in the genomic text every few 
thousand letters. These variations have to be analyzed with computational bi- 
ology methods. Eventually, these activities will result in today’s computational 
biology merging with the field of statistical genetics and epidemiology. 

One important point that we want to stress in the end is this. Computational 
biology is a field which needs much basic research. At the same time, however, 
the pressure for immediate biological innovation is tremendous. In this situation, 
the impact of computational biology research critically depends on an accurate 
understanding of the biological process which is to be empowered by the compu- 
tational biology contribution. It is critical to ask the right questions, and often 
modeling takes priority over optimization. Lastly, we need people that under- 
stand and love both computer science and biology to bring the field forward. 
Fortunately, it seems that a growing number of people discover their interest in 
both disciplines that make up computational biology. 
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optimization, Proceedings of the Fourth Annual Conference on Research 
in Computational Molecular Biology (RECOMB’OO), ACM Press (2000) 

- Offers a combinatorial method for optimizing hard-to-set parameters in 
protein folding programs. 14 

28. R. Zimmer, T. Lengauer, Fast and Numerically Stable Parametric Align- 
ment of Biosequences. Proceedings of the First Annual Conference on Re- 
search in Computational Molecular Biology (RECOMB’97) (1997) 344-353. 

- Solves an algorithmically challenging parametric version of the pairwise 
sequence alignment problem. 14 
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Abstract. Several algorithms for computing the Minkowski sum of two 
polygons in the plane begin by decomposing each polygon into convex 
subpolygons. We examine different methods for decomposing polygons by 
their suitability for efficient construction of Minkowski sums. We study 
and experiment with various well-known decompositions as well as with 
several new decomposition schemes. We report on our experiments with 
the various decompositions and different input polygons. Among our 
findings are that in general: (i) triangulations are too costly (ii) what 
constitutes a good decomposition for one of the input polygons depends 
on the other input polygon — consequently, we develop a procedure for 
simultaneously decomposing the two polygons such that a “mixed” ob- 
jective function is minimized, (iii) there are optimal decomposition algo- 
rithms that significantly expedite the Minkowski-sum computation, but 
the decomposition itself is expensive to compute — in such cases sim- 
ple heuristics that approximate the optimal decomposition perform very 
well. 



1 Introduction 

Given two sets P and Q in their Minkowski sum (or vector sum), denoted 
by P © <5, is the set {p + q \ p & P,q & Q}- Minkowski sums are used in a wide 
range of applications, including robot motion planning [19], assembly planning 
[11], computer-aided design and manufacturing (CAD/CAM) [6], and geographic 
information systems. 
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Motivated by these applications, there has been much work on obtaining 
sharp bounds on the size of the Minkowski sum of two sets in two and three 
dimensions, and on developing fast algorithms for computing Minkowski sums. 
It is well known that if P is a polygonal set with m vertices and Q is another 
polygonal set with n vertices, then P 0 Q is a portion of the arrangement of mn 
segments, where each segment is the Minkowski sum of a vertex of P and an edge 
of Q, or vice-versa. Therefore the size of P©<5 is 0{m?n?) and it can be computed 
within that time; this bound is tight in the worst case [14]. If P is convex, then 
a result of Kedem et al. [15] implies that P ®Q has 0{mn) vertices, and it can 
be computed in 0{mn\og{mn)) time [20]. If both P and Q are convex, then 
P0 Q is a convex polygon with at most m + n vertices, and it can be computed 
in 0{m + n) time [19]. In motion-planning applications, one is often interested 
in computing only a single connected component of the complement of P © Q. 
Faster algorithms are known for computing a single connected component of the 
complement of P © Q [12,23]. A summary of the known results on computing 
the Minkowski sum of two sets in three and higher dimensions can be found in 
a recent survey by Agarwal and Sharir [2] . 

We devised and implemented several algorithms for computing the Minkowski 
sum of two simple polygons^ based on the CGAL software library [1]. Our main 
goal was to produce a robust and exact implementation. This goal was achieved 
by employing the CGAL planar map package [9] while using exact number types. 
We are currently using our software to solve translational motion planning prob- 
lems in the plane. We are able to compute collision-free paths even in environ- 
ments cluttered with obstacles, where the robot could only reach a destination 
placement by moving through tight passages, practically moving in contact with 
the obstacle boundaries. This is in contrast with most existing motion planning 
software for which tight or narrow passages constitute a significant hurdle. For 
more details on our algorithms and implementation see [8]; we briefly describe 
the algorithms in Section 2 below. 

The robustness and exactness of our implementation come at a cost: they 
slow down the running time of the algorithms in comparison with a more stan- 
dard implementation that uses floating point arithmetic. This makes it especially 
necessary to try and speed up the algorithms in other ways. All our algorithms 
start with decomposing the input polygons into convex subpolygons. We discov- 
ered that not only the number of subpolygons in the decomposition of the input 
polygons but also their shapes had dramatic effect on the running time of the 
Minkowski-sum algorithms; see Figure 1 for an example. 

In this paper we examine different methods for decomposing polygons by 
their suitability for efficient construction of Minkowski sums. In the theoretical 
study of Minkowski-sum computation (e.g., [15]), the choice of decomposition 
is often irrelevant (as long as we decompose the polygons into convex subpoly- 
gons) because it does not affect the worst-case asymptotic running time of the 



^ Our Minkowski sum software works for general complex polygonal sets including 
polygons with holes. Although not all the decomposition methods discussed in this 
paper extend to polygons with holes, many of them are easy to extend. 
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naive triang. 


min Edf triang. 


min convex 


Edi 


754 


530 


192 


# of convex subpolygons in P 


33 


33 


6 


time (mSec) to compute P (B Q 


2133 


1603 


120 



Fig. 1. Different decomposition methods applied to the polygon P (leftmost in 
the figure), from left to right: naive triangiilation, minimum Edf triangulation 
and minimum convex decomposition (the details are given in Section 3). We can 
see in the table for each decomposition the sum of squares of degrees, the number 
of convex subpolygons, and the time in milliseconds to compute the Minkowski 
sum of the polygon with a small convex polygon, Q, with 4 vertices. 

algorithms. In practice however, as we already mentioned above, different de- 
compositions can induce a large difference in running time of the Minkowski- 
sum algorithms. The decomposition can affect the running time of algorithms 
for computing Minkowski sums in several ways: some of them are global to all 
algorithms that decompose the input polygons into convex polygons, while some 
others are specific to certain algorithms or even to specific implementations. We 
examine these various factors and report our findings below. 

Polygon decomposition has been extensively studied in computational ge- 
ometry; it is beyond the scope of this paper to give a survey of results in this 
area and we refer the reader to the survey papers by Keil [18] and Bern [3], 
and the references therein. As we proceed, we will provide details on specific 
decomposition methods that we will be using. 

In the next section we survey the Minkowski sum algorithms that we have 
implemented. In Section 3 we describe the different decomposition algorithms 
we have implemented. We present a first set of experimental results in Section 4 
and filter out the methods that turn out to be inefficient. In Section 5 we focus 
on the decomposition schemes that are not only fast to compute but also help 
to compute the Minkowski sum efficiently. 

2 Minkowski Sum Algorithms 

Given a collection C of curves in the plane, the arrangement A(C) is the sub- 
division of the plane into vertices, edges and faces induced by the curves in C. 
Planar maps are arrangements where the curves are pairwise interior disjoint. 
Our algorithms for computing Minkowski sums rely on arrangements and planar 
maps, and in the discussion below we assume some familiarity with these struc- 
tures, and with a refinement thereof called the vertical decomposition] we refer 
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the reader to [2,10,25] for information on arrangements and vertical decomposi- 
tion, and to [9] for a detailed description of the planar map package in CGAL 
on which our algorithms are based. 

The input to our algorithms are two simple polygons P and Q, with m and 
n vertices respectively. Our algorithms consist of the following three steps: 

Step 1: Decompose P into the convex subpolygons Pi, P 2 , • ■ • ,Ps and Q into 
the convex subpolygons Qi, Q 2 , ■ • ■ ,Qt- 

Step 2 : For each i G [l.-s] and for each j G [l..t] compute the Minkowski subsum 
Pi 0 Qj which we denote by Rij. We denote by R the set {Ri j \ i G [l.-s], j G 
[l..t]}. 

Step 3: Construct the union of all the polygons in R, computed in Step 2; the 
output is represented as a planar map. 

The Minkowski sum of P and Q is the union of the polygons in R. Each Rij 
is a convex polygon, and it can easily be computed in time that is linear in the 
sizes of Pi and Qj [19]. Let k denote the overall number of edges of the polygons 
in R, and let I denote the overall number of intersections between (edges of) 
polygons in R 

We briefly present two different algorithms for performing Step 3, computing 
the union of the polygons in R, which we refer to as the arrangement algorithm 
and the incremental union algorithm. A detailed description of these algorithms 
is given in [8]. 

Arrangement algorithm. The algorithm constructs the arrangement A(P) 
induced by the polygons in R (we refer to this arrangement as the underlying 
arrangement of the Minkowski sum) by adding the (boundaries of the) polygons 
of R one by one in a random order and by maintaining the vertical decomposition 
the arrangement of the polygons added so far; each polygon is chosen with 
equal probability at each step. Once we have constructed the arrangement, we 
efficiently traverse all its cells (vertices, edges, or faces) and we mark a cell as 
belonging to the Minkowski sum if it is contained inside at least one polygon of 
R. The construction of the arrangement takes randomized expected time 0(1 + 
klogk) [22]. The traversal stage takes 0(1 + k) time. 

Incremental union algorithm. In this algorithm we incrementally construct 
the union of the polygons in R by adding the polygons one after the other 
in random order. We maintain the planar map representing the partial union 
of polygons in R. For each r G R we insert the edges of r into the map and 
then remove redundant edges from the map. All these operations can be carried 
out efficiently using the planar map package. We can only give a naive bound 
O(k^log^k) on the running time of this algorithm, which in the worst case is 
higher than the worst-case running time of the arrangement algorithm. Prac- 
tically however the incremental union algorithm works much better on most 
problem instances. 

Remarks. (1) We also implemented a union algorithm using a divide-and- 
conquer approach but since it mostly behaves worse than the incremental algo- 
rithm we do not describe it here. The full details are given in [8]. (2) Our planar 
map package provides full support for maintaining the vertical decomposition. 
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and for efficient point location in a map. However, using simple point location 
strategies (naive, walk-along-a-line) is often faster in practice [9]. Therefore we 
ran the tests reported below without maintaining the vertical decomposition. 



3 The Decomposition Algorithms 

We briefly describe here different algorithms that we have implemented for de- 
composing the input polygons into convex subpolygons. We used both decom- 
position with or without Steiner points. Some of the techniques are optimal and 
some use heuristics to optimize certain objective functions. The running time 
of the decomposition stage is significant only when we search for the optimal 
solution and use dynamic programming; in all other cases the running time of 
this stage is negligible even when we implemented a naive solution. Therefore 
we only mention the running time for the “heavy” decomposition algorithms. In 
what follows P is a polygon with n vertices pi, . . . ,Pn, r of which are reflex. 



3.1 Triangulation 

Greedy triangulation. This procedure searches for a pair of vertices Pi,Pj such 
that the segment piPj is a diagonal, namely it lies inside the polygon. It adds 
such a diagonal, splits the polygon into two subpolygons by this diagonal, and 
triangulates each subpolygon recursively. The procedure stops when the polygon 
becomes a triangle. See Figure 1 for an illustration. 

In some of the following decompositions we are concerned with the degrees 
of vertices in the decomposition (namely the number of diagonals incident to a 
vertex). Our motivation for considering the degree comes from an observation 
on the way our planar map structures perform in practice: we noted that the 
existence of high degree vertices in the decomposition results in high degree 
vertices in the arrangement, which makes the maintenance of union of polygons 
slower (see the full version for details). 

Optimal triangulation — minimizing the maximum degree. Using dy- 
namic programming we compute a triangulation of the polygon where the maxi- 
mum degree of a vertex is minimal. The algorithm is described in [13], and runs 
in O(n^) time. 

Optimal triangulation — minimizing Edj . We adapted the minimal-maxi- 
mum-degree algorithm to find the triangulation with minimum Sdf where di is 
the degree of vertex Vi of the polygon. See Figure 1. 

3.2 Convex Decomposition without Steiner Points 

Greedy convex decomposition. The same as the greedy triangulation algo- 
rithm except that it stops as soon as the polygon does not have a reflex vertex. 
Minimum number of convex subpolygons (min-convex). We apply the 
algorithm of Keil [16] which computes a decomposition of a polygon into the 
minimum number convex subpolygons without introducing new vertices (Steiner 
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Fig. 2. From left to right: Slab decomposition, angle bisector (AB) decomposi- 
tion, and KD decomposition 



points). The running time of the algorithm is 0{nr^ logn). This algorithm uses 
dynamic programming. See Figure 1. This result was recently improved to 0{n+ 
r^min{r^,n}) [17]. 

Minimum convex decomposition. We modified Keil’s algorithm so that 
it will compute decompositions that minimize Sdf, the sum of squares of vertex 
degree. 

3.3 Convex Decomposition with Steiner Points 

Slab decomposition. Given a direction e, from each reflex vertex of the poly- 
gon we extend a segment in directions e and — e inside the polygon until it hits 
the polygon boundary. The result is a decomposition of the polygon into con- 
vex slabs. If e is vertical then this is the well-known vertical decomposition of 
the polygon. See Figure 2. The obvious advantage of this decomposition is its 
simplicity. 

Angle bisector decomposition (AB). In this algorithm we extend the inter- 
nal angle bisector from each reflex vertex until we first hit the polygon’s boundary 
or a diagonal that we have already extended from another vertex. This decompo- 
sition gives a 2-approximation to the optimal convex decomposition as described 
in [4]. See Figure 2. 

KD decomposition. This algorithm is inspired by the KD-tree method to 
partition a set of points in the plane [5]. See Figure 2. By this method we try to 
lower the stabbing number of the subdivision (namely, the maximum number of 
subpolygons in the subdivision intersected by any line) — the detailed discussion 
can be found at [8]. 

4 A First Round of Experiments 

Our implementation of the Minkowski sum package is based on the CGAL (ver- 
sion 2.0) [7] and LEDA (version 4.0) [21] libraries. Our package works with Linux 
(g-k- 1- compiler) as well as with WinNT (Visual G-k- 1- 6.0 compiler). The tests 
were performed under WinNT workstation on an (unloaded) 500 MHz Pentiu- 
mlll machine with 128 Mb of RAM. We measured the running times for the 
various algorithms with different input data. 

The input data that we present here is just a small representative sample of 
the polygonal sets on which tests were performed. As mentioned in the Introduc- 
tion, the complexity of the Minkowski sum can range from 0(m + n) (e.g., two 
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Fig. 3. Star input: The input (on the left-hand side) consists of two star-shaped 
polygons. The underlying arrangement of the Minkowski sum is shown in the 
middle. Running times in seconds for different decomposition methods (for two 
star polygons with 20 vertices each) are presented in the graph on the right-hand 
side. 



convex polygons) to (e.g. the fork input — Figure 4). Tests on two con- 

vex polygons are meaningless in our context. The “intermediate” inputs (star 
shaped polygons — Figure 3, random looking polygons) are interesting in that 
there are many different ways to decompose them. 



4.1 Results and discussion 

We ran both union algorithms (arrangement and incremental-union) with all 
nine decomposition methods on the input data. The running times for the com- 
putation of the Minkowski sum for two input examples are summarized in Fig- 
ures 3 and 4. Additional experimental results can be found at [8] and 
http : //www .math . tau. ac . il/~f lato/TriminkWeb/. 

It is obvious from the experimental results that triangulations result in poor 
union running times (the left three pairs of columns in the histograms of Fig- 
ures 3 and 4). By triangulating the polygons, we create (n — l)(m — 1) hexagons 
in R with potentially intersections between the edges of these poly- 

gons. We get those poor results since the performance of the union algorithms 
strongly depends on the number of vertices in the arrangement of the hexagon 
edges. Minimizing the maximum degree or the sum of squares of degrees in a 
triangulation is a slow computation that results in better union performance 
(compared to the naive triangulation) but is still much worse than other simple 
convex-decomposition techniques . 

In most cases the arrangement union algorithm runs much slower than the 
incremental union approach. By removing redundant edges from the partial sum 
during the insertion of polygons, we reduce the number of intersections of new 
polygons and the current planar map features. The fork input is an exception 
since the complexity of the union is roughly the same as the complexity of 
the underlying arrangement and the edges that we remove in the incremental 
algorithm do not significantly reduce the complexity of the planar map; see 
Figure 4. 
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Fig. 4. Fork input: the input (on the left-hand side) consists of two orthogo- 
nal fork polygons. The Minkowski sum in the middle has complexity 0{m^n?). 
The running times in seconds for different decomposition methods (for two fork 
polygons with 8 teeth each) are shown in the graph on the right-hand side. 




Fig. 5. On the left: when using the min-convex decomposition the union compu- 
tation time is the smallest but it becomes inefficient when counting the decompo- 
sition time as well (running times for two star polygons with 100 vertices each). 
On the right: average union running times for star inputs with the improved 
decomposition algorithms. 

The min-convex algorithm almost always gives the best union computation 
time but constructing this optimal decomposition may be expensive — see Fig- 
ure 5. Minimizing the sum of squares of degrees in a convex decomposition rarely 
results in a decomposition that is different from the min-convex decomposition. 

This first round of experiments helped us to filter out inefficient methods. 
In the next section we focus on the better decomposition algorithms (i.e., min 
convex, slab, angle bisector), we further study them and attempt to improve 
their performance. 

5 Revisiting the More Efficient Algorithms 

In this section we focus our attention on the algorithms that were found to 
be efficient in the first round of experiments. We present an experiment that 
shows that, contrary to the impression that the first round of results may give, 
minimizing the number of convex subpolygons in the decomposition does not 
always lead to better Minkowski-siim computation time. We also show in this 
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mi 

Fig. 6. Left to right: knife input polygons {P above Q), two types of decom- 
positions of P (enlarged), the Minkowski sum and the underlying arrangement 
(generated using the “short” decomposition). 

section that in certain instances the decision how to decompose the input polygon 
P may change depending on the other polygon Q. 




5.1 Nonoptimality of Min-convex Decompositions 

Minimizing the number of convex parts of P and Q can be expensive to com- 
pute, but it does not always yield the best running time of the Minkowski sum 
construction. In some cases other factors are important as well. Consider for 
example the knife input data. P is a long triangle with j teeth along its base 
and Q is composed of horizontal and vertical teeth. See Figure 6. P can be 
decomposed into j -I- 1 convex parts by extending diagonals from the teeth in 
the base to the apex of the polygon. Alternatively, we can decompose it into 
j + 2 convex subpolygons with short diagonals (this is the “minimal length AB” 
decomposition described below in Section 5.3). If we fix the decomposition of 
Q, the latter decomposition of P results in considerably faster Minkowski-sum 
running time, despite having more subpolygons, because the Minkowski sum 
of the long subpolygons in the first decomposition with the subpolygons of Q 
results in many intersections between the edges of polygons in R. In the first 
decomposition we have j + 1 long subpolygons while in the latter we have only 
one “long” subpolygon and j + 1 small subpolygons. 

5.2 Mixed objective function 

Good decomposition techniques that handle P and Q separately might not be 
sufficient because what constitutes a good decomposition of P depends on Q. We 
measured the running time for computing the Minkowski sum of a knife polygon 
P (Figure 6) and a random looking polygon Q. We scaled Q differently in each 
test. We fixed the decomposition of Q and decomposed the knife polygon P once 
with the short j + 2 “minimal length AB” decomposition and then with the long 
j + \ minimum convex decomposition. The results are presented in Figure 7. 
We can see that for small Q’s the short decomposition of the knife P with more 
subpolygons performs better but as Q grows the long decomposition of P with 
fewer subpolygons wins. 

These experiments imply that a more careful strategy would be to simulta- 
neously decompose the two input polygons, or at least take into consideration 
properties of one polygon when decomposing the other. 
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Fig. 7. Minkowski sum of a knife, P, with 22 vertices and a random polygon, Q, 
with 40 vertices using the arrangement union algorithm. On the left-hand side the 
underlying arrangement of the sum with the smallest random polygon and on the 
right-hand side the underlying arrangement of the sum with the largest random 
polygon. As Q grows, the number of vertices I in the underlying arrangement 
is dropping from (about) 15000 to 5000 for the “long” decomposition of P, and 
from 10000 to 8000 for the “short” decomposition. 



The running time of the arrangement union algorithm is 0{I+k log k) , where 
k is the number of edges of the polygons in R and I is the overall number of inter- 
sections between (edges of) polygons in R (see Section 2). The value of k depends 
on the complexity of the convex decompositions of P and Q. Hence, we want to 
keep this complexity small. It is harder to optimize the value of /. Intuitively, we 
want each edge of R to intersect as few polygons of R as possible. Considering the 
standard rigid-motion invariant measure on lines in the plane [24] , we developed 
the following cost function c{Dp^q) of a simultaneous convex decomposition of 
P and Q. Let ttp, kp and Ap he the perimeter, number of convex subpolygons 
and the sum of the lengths of the diagonals of P, respectively. Similarly define 
ttq, kg and Ag. Then c{Dp^g) = kg{2Ap + ttp) -|- kp(2Ag + ttq), which is the 
sum of the perimeters of the polygons in R. 

If we do not allow Steiner points, we can modify the dynamic-programming 
algorithm by Keil [16] to compute a decomposition that minimizes this cost 
function in time, where rp (rg) is the number of reflex vertices 

of P (resp. Q). We did not make any serious attempt to improve the running 
time. Because of lack of space we omit the details; see further information in [8]. 

If we allow Steiner points, then it is an open question whether an optimal de- 
composition can be computed in polynomial time. Currently, we do not even have 
a constant-factor approximation algorithm. The difficulty arises because unlike 
the minimum-size decomposition for which an optimal algorithm is known [4], 
no constant-factor approximation is known for minimum-length convex decom- 
position of a simple polygon if Steiner points are allowed. 



5.3 Improving the AB method 

Most of the tests suggest that in general the AB decomposition algorithm works 
better than the other heuristics. We next describe our attempts to improve this 
algorithm. 
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Minimal length angle bisector decomposition. In each step we handle 
one reflex vertex. For a reflex vertex we look for one or two diagonals that will 
eliminate it. We choose the shortest combination among the eliminators we have 
found. As we can see in Figure 5, the minimal length AB decomposition performs 
better than the naive AB even though it generally creates more subpolygons. 

We tried to further decrease the number of convex subpolygons generated by 
the decomposition algorithm. Instead of emanating a diagonal from any reflex 
vertex, we first tested whether we can eliminate two reflex vertices with one 
diagonal (we call such a diagonal a 2-reflex eliminator). All the methods listed 
below generate at most the same number of subpolygons generated by the AB 
algorithm but practically the number is likely to be smaller. 

Improved angle bisector decomposition. At each step, we choose a reflex 
vertex v and search for a 2-reflex eliminator incident upon ri. If we cannot And 
such a diagonal, we continue as in the standard AB algorithm. 

Reflex angle bisector decomposition. At each step, we check all pairs of 
reflex vertices to And a 2-reflex eliminator diagonal. When there are no more 
2-reflex eliminators, we continue with the standard AB algorithm on the rest of 
the reflex vertices. 

Small side angle bisector decomposition. As in the reflex AB decompo- 
sition, we are looking for 2-reflex eliminators. Such an eliminator decomposes 
the polygon into two parts, one on each side. Among the candidate eliminators 
we choose the one that has the minimal number of reflex vertices on one of its 
sides. By this we are trying to “block” the minimal number of reflex vertices 
from being connected (and eliminated) by another 2-reflex eliminator diagonal. 

Experimental results are shown in Figure 5. These latter improvements to 
the AB decomposition seem to have the largest effect on the union running 
time, while keeping the decomposition method very simple to understand and 
implement. Note that the small side AB heuristic results in 20% faster union 
time than the improved AB and reflex AB decompositions, and 50% faster than 
the standard angle bisector method. 
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Abstract. An instance of Hypergraph Max fc-Cut with given sizes of 
parts (or Hyp Max fc-CuT with gsp) consists of a hypergraph H = 
{V, E), nonnegative weights ws defined on its edges S £ E, and k positive 
integers pi, ■ ■ ■ ,Pk such that = |U|. It is required to partition 

the vertex set V into k parts Xi,... ,Xk, with each part Xi having 
size Pi, so as to maximize the total weight of edges not lying entirely 
in any part of the partition. The version of the problem in which \Xi\ 
may be arbitrary is known to be approximable within a factor of 0.72 
of the optimum (Andersson and Engebretsen, 1998). The authors (1999) 
designed an 0.5-approximation for the special case when the hypergraph 
is a graph. The main result of this paper is that Hyp Max fc-CuT with 
GSP can be approximated within a factor of min{A|s| : S £ E} of the 
optimum, where Ar = 1 — (1 — l/r)’’ — (1/r)’’. 



1 Introduction 

The last decade has been marked by striking breakthroughs in designing ap- 
proximation algorithms with provable performance guarantees. Most of them 
to all appearances are due to using novel methods of rounding polynomially 
solvable fractional relaxations. Applicability of a rounding method is highly de- 
pendent on the type of constraints in the relaxation. In [1] the authors presented 
a new rounding technique (the pipage rounding) especially oriented to tackle 
some NP-hard problems, which can be formulated as integer programs with 
assignment-type constraints. The paper [1] contains four approximation results 
demonstrating the efficiency of this technique. One of the problems treated in [1] 
is Max fc-Cut with given sizes of parts (or Max fc-CuT with GSP). An instance 
of this problem consists of an undirected graph G = (V,E), nonnegative edge 
weights We, e £ E, and k positive integers pi,p 2 , ■ ■ ■ ,Pk such that J2i=i Pi ~ 1^1- 
It is required to find a partition of V into k parts Vi , V 2 , ■ • ■ , with each part 
Vi having size pi, so as to maximize the total weight of edges whose ends lie 
in different parts of the partition. The paper [1] gives an 0.5-approximation for 
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this problem. Very recently, Feige and Langberg [7] by a combination of the 
method of [1] with the semidefinite programming technique designed an 0.5 + e- 
approximation for Max 2- Cut with GSP, where e is some unspecified small 
positive number. 

The Max Cut and Max fc-CuT problems are classical in combinatorial 
optimization and have been extensively studied in the absence of any restrictions 
on the sizes of parts. The best known approximation algorithm for Max Cut is 
due to Goemans and Williamson [9] and has a performance guarantee of 0.878. 
Frieze and Jerrum [8] extended the technique of Goemans and Williamson to 
Max fc-CuT and designed a (1 — l/fc + 21nfc/fc^)-approximation algorithm. On 
the other hand, as it was shown by Kami et al. [10], no approximation algorithm 
for Max fc-CUT can have a performance guarantee better than 1 — l/34fc, unless 
P=NP. 

Approximation results for some special cases of Max fc-CuT with GSP 
have been also established. In particular. Frieze and Jerrum [8] present an 0.65- 
approximation algorithm for Max Bisection (the special case of Max fc-CuT 
WITH GSP where k = 2 andpi = P 2 = |I^|/2). Very recently. Ye [11] announced an 
algorithm with a better performance guarantee of 0.699. The best known approx- 
imation algorithm for Max fc-SECTiON (the case where pi = ■ ■ ■ = Pk = \V\/k) 
is due to Andersson [2] and has a performance guarantee of 1 — l/fc-|- Sl{l/k^). 

In this paper we consider a hypergraph generalization of Max fc-CUT with 
GSP — Hypergraph Max fc-Gut with given sizes of parts or, for short. Hyp Max 
/ c-CuT WITH GSP. An instance of Hyp Max k-CvT with gsp consists of a 
hypergraph H = (V,E), nonnegative weights ws on its edges S, and k positive 
integers Pi , ... ^pk such that J2i=iPi ~ jl^j- It is required to partition the vertex 
set V into k parts Ai, A 2 , . . . ,Xk, with each part Xi having size pi, so as to 
maximize the total weight of the edges of H, not lying entirely in any part of 
the partition (i.e., to maximize the total weight of S G E satisfying S % Xi for 
all i). 

Several closely related versions of Hyp Max fc-CuT with gsp were studied 
in the literature but few results have been obtained. Andersson and Engebretsen 
[3] presented an 0.72-approximation algorithm for the ordinary Hyp Max Cut 
problem (i.e., for the version without any restrictions on the sizes of parts). 
Arora, Karger, and Karpinski [4] designed a PTAS for dense instances of this 
problem or, more precisely, for the case when the hypergraph El is restricted 
to have 6>(|V|‘^) edges, under the assumption that [S'] < d for each edge S and 
some constant d. 

In this paper, by applying the pipage rounding method, we prove that Hyp 
Max /c-Cut with gsp can be approximated within a factor of min{A| 5 | : S € E} 
of the optimum, where Ar = 1 — (1 — f/rY — (1/r)’’. By direct calculations it 
easy to get some specific values of A^: A 2 = 1/2 = 0.5, A 3 = 2/3 ~ 0.666, 
A 4 = 87/128 « 0.679, A 5 = 84/125 = 0.672, Ae « 0.665 and so on. It is clear 
that Xr tends to 1 — « 0.632 as r — > 00 . A bit less trivial fact is that 

Ar > 1 — for each r > 3 (Lemma 2 in this paper). Summing up we arrive 
at the following conclusion: our algorithm finds a feasible cut of weight within 
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a factor of 0.5 on general hypergraphs, i.e., in the case when each edge of the 
hypergraph has size at least 2, and within a factor of 1 — « 0.632 in the case 

when each edge has size at least 3. Note that the first bound coincides with that 
obtained in [1] for the case of graphs. In this paper we also show that in the case 
of hypergraphs with each edge of size at least 3 the bound of 1 — cannot be 
improved, unless P=NP. 



2 The Pipage Rounding: A General Scheme 

We begin with a description of the pipage rounding [1] for the case of a slightly 
more general constraints. 

Assume that a problem P can be reformulated as the following nonlinear 
binary program: 



max F{xii, . . . ,Xnk) 






(1) 


S. t. '^Xit =pt, t=l,. 


..,fc. 




(2) 


k 

'^Xit = l, i = l,.. 


. ,n. 




(3) 










Xit G {0, 1}, t = 1, . . 


,.,k, 1 = 1,.. 


. ,n. 


(4) 



where pi,p 2 , ■ ■ ■ ,Pk are positive integers such that ~ is a function 

defined on the rational points x = (xu) of the n x /c-dimensional cube [0, 1]”^^ 
and computable in polynomial time. Assume further that one can associate with 
F{x) another function L[x) that is defined and polynomially computable on the 
same set, coincides with F{x) on binary x satisfying (2)-(3), and such that the 
program 



max L{x) 






(5) 


n 

S. t. '^Xit=Pt, t=l,. 


..,fc. 




(6) 


i=l 








k 








'^Xit = l, i=l,.. 


. ,n. 




(7) 












, .,fc, 1 = 1,.. 


.,n 


(8) 



(henceforth called the nice relaxation) is polynomially solvable. Assume next that 
the following two main conditions hold. The first — F/L lower bound eondition 
— states: there exists C > 0 such that F{x)/L{x) > C for each x S [0,1]"^^. 
To formulate the second — e-convexity condition — we need a description of the 
so-called pipage step. 

Let a: be a feasible solution to (5)-(8). Define the bipartite graph H with 
the bipartition ({1, . . . , n}, {!,... , fc}) so that jt G E{H) if and only if Xjt is 
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non-integral. Note that (6) and (7) imply that each vertex of H is either isolated 
or has degree at least 2. Assume that x has fractional components. Since H is 
bipartite it follows that H has a cycle D of even length. Let M\ and M 2 be the 
matchings of H whose union is the cycle D. Define a new solution x{e) by the 
following rule: if jt is not an edge of D, then Xjt{e) coincides with xjt, otherwise, 
Xjt{s) = Xjt -I- e if jt € Ml, and Xjt{e) = Xjt — e if jt S M2. 

By definition x{e) is a feasible solution to the linear relaxation of (5)-(8) for 
all e € [—£1,62] where 



£i = min{ min xa, min (1 — 
jteMi jteM2 



and 

£2 = mini min (1 — Xjt), min X4t\. 
jteMi jteAh 

The e-convexity condition states that for every feasible x and every cycle D 
in the graph H, ip(e) = F{x{e)) is a convex function on the above interval. 

Under the above assumptions we claim that there exists a polynomial-time C- 
approximation algorithm for solving P. Indeed, since the function Lp{e) = F(x{e)) 
is convex, 

F{x{e*)) > F{x) > CL{x) 

for some £* G {—£1, £2}- The new solution x{e*), being feasible for (5)-(8), has a 
smaller number of fractional components. Set x' = x{e*) and, if x' has fractional 
components, apply to x' the above described pipage step and so on. Ultimately, 
after at most nk steps, we arrive at a solution x which is feasible for (l)-(4) and 
satisfies 

F(5) > CL{x) > CF* 

where F* is an optimal value of (l)-(4) (and of the original problem P). The 
rounding procedure described (and henceforth called the pipage rounding) can 
be clearly implemented in polynomial time. 

Thus we obtain a C-approximation algorithm for P. It consists of two phases: 
the first phase is to find a feasible (fractional) solution to (5)-(8), and the second 
is to round off this solution by using the pipage rounding. 

3 The Pipage Rounding: An Application to the Problem 

It is easy to see that an instance of Hyp Max fc-Cux with GSP can be equiva- 
lently formulated as the following (nonlinear) integer program: 

k 

max F{x) = E ws{l - En Xit) (9) 

seE t=i ies 

k 

s. t. Xit = 1 for all i, 

t=i 



( 10 ) 
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n 



Xit = Pt for all t, 

i—1 


(11) 


Xit G {0, 1} for all i and t. 


(12) 


The equivalence is shown by the one-to-one correspondence between optimal 
solutions to the above program and optimal fc-cuts {Xi,... ^Xk} of instance 
of Hyp Max fc-Cux with GSP defined by the relation “xit = 1 if and only if 


ieXtF 




As a nice relaxation we consider the following linear program: 




max wszs 


(13) 


seE 




s. t. zs < -S' — yy Xit for all S G E, 


(14) 


ieS 

k 




yy Xit = 1 for all i, 

t=i 


(15) 


n 

yy Xit = Pt for all t, 
2 = 1 


(16) 


0 < Xit < 1 for all i and t, 


(17) 


0 < zs < 1 for all S G E. 


(18) 



It is easy to see that, given a feasible matrix x, the optimal values of zs in 
the above program can be uniquely determined by simple formulas. Using this 
observation we can exclude the variables zs and rewrite (13)-(18) in the following 
equivalent way: 

max A(a:) = res min{l, minds'! — Xit)} (19) 

S&E ies 

subject to (15)-(17). Note that F{x) = L{x) for each x satisfying (10)-(12). 

We claim that, for every feasible x and every cycle D in the graph H (for 
definitions, see Section 2), the function ‘^{e) = F{x{e)) is a quadratic polyno- 
mial with a nonnegative leading coefficient. Indeed, observe that each product 
contains at most two modified variables. Assume that a product 
riigs 2 :it(£:) contains exactly two such variables a:qt(e) and Xi^tis)- Then they 
can have only one of the following forms: either ccqt -I- e and Xi^t — e or Xi^t — £ 
and Xi^t + £, respectively. In either case has a nonnegative coefficient in the 
term corresponding to the product. This proves that the e-convexity condition 
does hold. 

For any r > 1, set A^. = 1 — (1 — l/r)’’ — (1/r)’'. 

Lemma 1. Let x = (xu) be a feasible solution to (19),(15)-(17) and S G E. 
Then 

k 

(i-En Xit) > A|S| min{l,inin(|S'| - xu)}. 

t=i ies ies 
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Proof. Let zs = min{l, mintd^l — xu)}. Define qs and t' by the equalities 



Note that 



qs = max^ Xit = ^ xn' ■ 
ies ies 



zs = min{l, l^l - qs}. 



( 20 ) 



Using the arithmetic-geometric mean inequality and the fact that 

= 1^1 

t=i ies 



we obtain that 

k 

i^S iGS iGS 



> 1 - 


1 |5| . 


)'"-i:(^‘S 




> 1 - 


fqs\\S\ 


( "^ies \ 


|S| 




1 1^1 ) 




- 1 - 




/ “S' — J2ies \ 


|S| 






\ |5| ) 




- 1 - 


/qs\\S\ 


U qs\l^l 








1 1^1 ^ • 





( 21 ) 



/ \UI / \UI 
Let fj{y) = 1 - (l - . 

Case 1. — 1 < qs < |S'|. Then by (20), zs = |S'| — qs, and hence by (21), 

'-i:n-si-(i-i)'"-(g)'"=*fe)- 

i=l zGS ' ' ' ' 

Since the function ip is concave and V'(O) = 0; '0(1) = -^ISI) it follows that 

k 

1-En ^ -^ISUS- 

t=l iGS 

Case 2. 1 < qs < |>S'| — 1- Here zs = 1- Since ip{y) is concave and ■0(1) = 
iPi\S\-l) = \si, 

k 

1-EIl'*'** - ^ui- 

t=i ies 
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Case 3. 0 < gs < 1. Again, zs = 1- For every t, set fit = Note that, 

by the assumption of the case. 



0 < /xt < 1, 



( 22 ) 



and, moreover. 






(23) 



By the arithmetic-geometric mean inequality it follows that 

k k I o I 

sn-“^i:(i) 

i^S ' ' 

k 

(by (22)) 






Consequently, 



(by (23)) =|5|-l^l|5|. 



1 \IS| 






i=l zGS 



1^1' 

1 \|S| 



1 \I-SI 






,| 5 | 

1 \is| 






y\isi 
1^, 



- A|s|- 



Corollary 1. Let x = (xu) be a feasible solution to (19),(15)-(17). Then 



F{x) > (imnA|s|)L(a;). 

oG-tj 



The corollary states that the F/ L lower bound condition holds with 

C = nnnA|s|. 

Hence the pipage rounding provides an algorithm that finds a feasible fc-cut 
whose weight is within a factor of minsg^ A| 5 | of the optimum. 

Note that A 2 = 1/2. We now establish a lower bound on Ar for all r > 3. 

Lemma 2. For any r > 3, 



Xr > I — e 
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Proof. We first deduce it from the following stronger inequality: 
(l-i) forallr>l. 



(24) 



Indeed, for any r > 3, 



Is r 






1/e-i 



= 1 — e’ 



-1 _l_ _ f- 'j 

r\ 2 j-r-i J 



To prove (24), by taking natural logarithm of both sides of (24) rewrite it in the 
following equivalent form: 

1 + rln(l ) < ln(l ) for all r > 1. 

r 2r 

Using the Taylor series expansion 

OO 2 

ln(l -a) = -J2j 

we obtain that for each r = 1, 2, . . . , 

l + "ln(l-i) = l + r(-i-^-^-...) 

_ 1 1 1 

2r 3r^ 4r^ 

<11 1 



2r 2(2r)2 3(2r)3 ‘ " 



= ln(l - 



as required. □ 

We now show that in the case of r-uniform hypergraphs the integrality gap 
for the relaxation (13)-(18) can be arbitrarily close to A^-. It follows that no other 
rounding of this relaxation can provide an algorithm with a better performance 
guarantee. 

Indeed, consider the following instance: the complete r-uniform hypergraph 
on n = rq vertices, k = 2, ws = 1 for all S G E, pi = q and p 2 = n — q. It is 
clear that any feasible cut in this hypergraph has weight 
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Consider the feasible solution to (15)-(18) in which 

Xu = \jr and xa = 1 — 1/r for each i. 



The weight of this solution is equal to since for each edge S we have 

r - ^ Xii > r - ^ = 1 

ieS ieS 



and therefore zs = I for all S G E. Thus the integrality gap for this instance is 
at most 



g!(n — r)! 


(n — g)!(n — r)! 


(g — ry.nl 


(n — q — r)\n\ 


?! 


1 


(g — r)!n'’ 


{n — q — r)!n’’ 


1 

1 


{n — q — rY 




which tends to as g ^ oo. 

We conclude the paper with a proof that the performance bound of 1 — e“^, 
our algorithm provides on hypergraphs with each edge of size at least 3, cannot 
be improved, unless P = NP. 

In the Maximum Coverage problem (Maximum Coverage for short), given 
a family P = {Sj ■. j G J} of subsets of a set / = {!,... ,n} with associated 
nonnegative weights Wj and a positive integer p, it is required to find a subset 
X C I (called coverage ) with |X| = p so as to maximize the total weight of 
the sets in T having nonempty intersections with X. It is well known that a 
simple greedy algorithm solves Maximum Coverage approximately within a 
factor of 1 — e~^ of the optimum (Cornuejols, Fisher and Nemhauser [5]). Feige 
[6] proved that no polynomial algorithm can have better performance guarantee, 
unless P=NP. 

Our proof consists in constructing an approximation preserving reduction 
from Maximum Coverage to Hyp Max fc-Cux with gsp. Let a set /, a 
collection , S'™ C I, nonnegative weights (tCj), and a positive number 

p form an instance A of Maximum Coverage. Construct an instance B of 
Hyp Max /c-Cut with gsp as follows: 7' = J U {ui, . . . ,Um} (assuming that 
/n{ui, . . . ,Um} = 0 ), {S[ = SiUjui}, ... ,S!^ = {«„}), the same weights 

Wj, and Pi = p, p 2 = |7'| — p. Let {X,I' \ X) be a maximum weight cut in 
B with the sizes of parts pi and p 2 - It is clear that its weight is at least the 
weight of a maximum coverage in A. Thus it remains to transform {X,P \ X) 
into a coverage of A with the same weight. If X C 7, we are done. Assume that 
X contains Uj for some j. Then successively, for each such j, replace Uj in X 
by an arbitrary element in Sj that is not a member of X, or if Sj C X, by an 
arbitrary element of 7 that is not a member of X. After this transformation and 
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after possibly including a few more elements from / to get exactly p elements, 
we arrive at a coverage Y C / in A whose weight is at least the weight of the 
cut {X, I' \ X) in B, as desired. 
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Abstract. In the offline list update problem, we maintain an unsorted linear list 
used as a dictionary. Accessing the item at position i in the list costs i units. In 
order to reduce access cost, we are allowed to update the list at any time by trans- 
posing consecutive items at a cost of one unit. Given a sequence a of requests 
one has to serve in turn, we are interested in the minimal cost needed to serve 
all requests. Little is known about this problem. The best algorithm so far needs 
exponential time in the number of items in the list. We show that there is no poly- 
nomial algorithm unless P = NP. 

Keywords: On-line algorithms, competitive analysis, list-update, NP. 



1 Introduction 

The list update problem is a classical online problem in the area of self-organizing data 
structures [16,4,8]. In this paper, we will be concerned with the offline version of that 
problem (OLUP). In both versions, requests to items in an unsorted linear list must be 
served by accessing the requested item. If the item is at position i in the list, the access 
incurs a cost of i units in the full cost model and i — 1 units in the partial cost model 
[10,1 1]. The goal is to keep access cost small by rearranging the items in the list. We 
have so-called free exchanges (the requested item can be moved closer to the list at no 
charge just after the request) and paid exchanges (any two consecutive items can be 
transposed at cost one). 

An online algorithm must serve the sequence a of requests one item at a time, without 
seeing future requests. An offline algorithm knows the entire sequence cr in advance. A 
feasible (but not necessarily optimal) solution for an instance is called a schedule. 

To measure the performance of an online algorithm A on some sequence cr, we compare 
its overall cost A(cr) with the cost OPT {a) incurred by an optimal offline algorithm, and 
we call A c-competitive, if for a suitable b, 

A{a) <c-OPT{a) + b, (1) 

for all sequences a. If A is randomized, A{a) denotes the expected cost incurred by 
serving a. 
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It is known that the best randomized online algorithm satisfies 1.5 < c < 1.6 in the full 
cost model [2,17]. For the partial cost model, a lower bound of c > 1.50084 can be 
shown [5]. 

Because of (1), the analysis of an online algorithm requires some understanding of the 
optimal offline algorithm OPT. Most algorithms analyzed so far have a property called 
projectivity [7]. For a projective algorithm A, a proof that A is c-competitive on lists 
with only two items generalizes automatically to longer lists. Fortunately on two items, 
OPT is fully understood. Recently it was shown that the upper bound of 1.6 cannot be 
further improved by projective algorithms [6]. This means that understanding OPT has 
become an even more important issue. 

A better understanding of OPT could be helpful in two respects for online list up- 
date. First and most obviously, it may lead to better lower bounds on OPT {a) to plug 
into (1); second, it might reveal structural properties that turn out to be useful in de- 
vising better online algorithms. Properties of OPT have already been studied in the 
past [14,15,2,3]. 

The size of an OLUP instance on n items and m requests is 6>(log(n) • m). But we can 
assume m > n. Therefore, an algorithm is still polynomial if its runtime is polynomial 
in n. 

The currently best algorithm for OLUP runs in 0(2 "n!m) [ 14]. It is based on a straight- 
forward dynamic programming algorithm for metrical task systems [ 13] which works 
as follows. 

Let d{i,L) be the minimal cost needed to serve the first i requests of the request se- 
quence cr and end up in the list state L. We define d{0,L) = 0 if L is the initial list state 
and d(0, L) = oo for all other L. We can compute d{i, L) using 

d{i, L) = rnin(d(* — 1, L') + trans{L' , L) + acc(i^ L)) . (2) 

Here, trans{L\ L) denotes the minimal cost to move from state L' to L and acc{i, L) 
denotes the cost for accessing ai in L. 

Note that there is always an optimal algorithm which does no free exchanges, therefore 
trans{L, L') is easy to compute. The time needed to compute all d{i, L) is 0((n!) ^m). 

This runtime can be reduced to 0(2 "n!m) by using the fact that there is an optimal algo- 
rithm which uses only so-called subset transfers [ 14]. In a subset transfer, one moves a 
subset of the items preceding the requested item x just behind x without changing their 
relative orders. Only 0(2") among the n! possible transformations are subset transfers. 

In this paper, we show that it is NP-hard to decide whether a sequence cr can be served 
with cost less or equal k. Therefore, there cannot be a polynomial time algorithm in n 
and m unless P = NP. This solves a longstanding open question ([8], page 22). One 
interpretation of this result is that the structure of OPT is not easy to understand; con- 
sequently, it might be very difficult to develop (and analyze) non-projective list update 
algorithms that improve the current upper bound of 1 .6. 

As an important part of the proof, we introduce a generalization of OLUP called weigh- 
ted list update problem iWLUP). Here the items have a weight that influences access and 
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transposition cost. A version of WLUP was considered already in [ 9] , but our definitions 
and applications are different. 

Throughout this paper, we assume the partial cost model. This is simpler to analyze 
than the original full cost model. It is easy to obtain the value of OPT [a) in the full cost 
model by adding |(t| to the optimal cost in the partial cost model. Therefore, the proof 
certainly holds for the full cost model as well. 

2 The Weighted List Update Problem 

In this section, we generalize offline list update to items with weights. These weights 
have to be positive integers. We denote weighted items by capital letters. 

An instance {a, L, W) of {WLUP) consists of a request sequence a and an initial list L 
over a set of weighted items Xi with weights Wi. Let W be the vector whose ith entry 
Wi is the weight of item Xi. 

The cost incurred by operating on weighted items are the following. In order to trans- 
pose two items Xi and Xj with weights Wi and wj respectively, we pay Wi ■ Wj units. 
The access cost for an item Xi with weight Wi are the following. Let S be the set of 
items in front of Xi, then accessing Xi costs 



This setting is motivated by the following interpretation of WLUP in terms of OLUP. 
Think of a weighted item Xi as a vector of items [xi^i . . . Xi^^i]- Accessing Xi means 
to access every item in Xi’s vector in turn. In order to access Xij, one has to pass all 
xi^k with Xi G S plus all Xi^k with k < j. Summing up over all items in Xi, we get (3). 
If two weighted items Xi and Xj are transposed, every item in X^’s vector has to pass 
every item in Xj’s vector. In total, we need Wi ■ Wj transpositions. 

The next lemma shows that the optimal schedules for the WLUP instance and its inter- 
pretation as an OLUP instance have the same cost. The lemma also serves as a proof 
for the reduction 



Note that the reduction is polynomial only for instances where the weights of the items 
are polynomial in the number of items or requests. 

Lemma 1. Let {a, L, W) be an instance of WLUP and let {<j' , L') be the transforma- 
tion of (a, L, W) to OLUP by replacing the items of (cr, L) by vectors of items. Then 
the costs of their optimal schedules are the same. 

Proof. Let us assume that the items of (cr, L, W) are denoted by Xi, and let Wi be the 
weight of Xi. In order to obtain (cr', L'), we replace the Xi by Xi^i . . . Xi^^t in L and cr. 
As an example, cr = X 2 X 2 X 1 , L = [XiX 2 Xf\, and W = [3, 2, 2] transforms to 

cr' = X2,lX2,2 X2,lX2,2 Xi^iXi^2Xu3 and L' = [xi^iXi^ 2 Xl ,3 X2, 1X2,2 X 3 ,iX 3 , 2 ]. 




( 3 ) 



WLUP <p OLUP. 



( 4 ) 



Offline List Update is NP-hard 



45 



Clearly, the optimal cost of (ct', L') are not larger than those of (cr, L, W), because we 
can serve (cr', L') according to the interpretation given above. It remains to show the 
the optimal cost for (cr, L, W) are not larger than those of (cr', L'). 

From (cr', L'), we can produce an instance of WLUP called (cr", L" , 1) which has the 
same optimal value as (cr', L'): In a first step, we replace the Xij by weighted items 
Yij with weight Wij . Note that for the moment, we do not set values to the Wij . Let us 
refer to this instance by (cr", L", W”). In order to obtain (cr", L", 1), we further assign 
all weights to Wij = 1. An optimal schedule for {a',L') is clearly also optimal for 
(cr", L", 1). (The cost for a pair of items with weight one in WLUP are the same as for 
a pair of items in OLUP.) Therefore, the cost of an optimal schedule is the same for 
both (cr", L", 1) and (ct', L'). 

We can apply the optimal schedule for (cr", L", 1) also to (cr", L", W”) with any set- 
ting of W . Note that the schedule remains the same by changing the weights. Only 
the cost for serving the schedule change. The cost for serving this schedule can be 
expressed as a function in the Wij . 

In order to do so, we dehne constants for each pair {Yij,Ykj}. In 

we sum up the number of times Yij and Ykj are transposed in the schedule plus the 
number of times Yij is in front of Yk^i when Ykj is requested and vice versa. So these 
constants are independent of Wij and Wk,i- The total cost of the optimal schedule for 
(cr", L", 1) applied to (cr", L", W") then is 

= I] ■ Wi,jWk,i+J2 (^2^) ■ 



Here \ay. \ denotes the number or requests to Yij in cr". The second sum comes from 
the binomial coefficients in (3). The second term of (3) is part of the f(ij)^{k,i) ■ 

For any choice of the Wij, equation (5) upper bounds the cost of the optimal schedule 
for (cr", L” , LF"). This holds because (5) denotes the cost of a legal (but not necessary 
optimal) schedule for (cr", L", FF"). 

In order to prove (cr", L", 1) > (cr, L, IF), we proceed as follows. Starting with the 
initial values Wij = 1 for all i and j, we will change the values of the Wij step by step 
in such a way that the value of (5) does not increase and we will end up with an instance 
which is equivalent to (cr, L, IF). 

This is how we change the weights. As long as there is an i with Wij > 0 and Wi^k > 0 
for some i ^ k,we set one of them to the sum of the two. The other one is set to zero. 
Note that an item with weight zero incurs neither access nor transposition cost. 

Let us write (5) such that we can more easily detect how its value changes if we change 

Wij and Wi^k- 



C — Cq F C'l * F C2 * yJijk F f (^i^j)^(^i^k) ‘ 



F 





// 

Yi,k 



(6) 
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Here, Cq denotes all costs independent of both Wij and Wi^k- By Ci ■ Wij and C2 ■ Wi^k, 
we denote cost depending linearly only on one of the two. 

Assume w.l.o.g. Ci < C2. If we set the new value of Wij to Wij + Wi^k and set Wi^k to 
zero, the value of ( 5 ) does not increase. This holds because Cq does not change and 

+ Wi^k) + C 2 • 0 < Cl ■ Wij + C 2 • Wi^k, (7) 



and furthermore 



f ‘ WijWi^k “t” 
/(j,i),(i.fe) • {Wi,j + Wz.fc) -0 + 






Wij + WiA I „ 



Wi^k 

2 



K,J> 






( 8 ) 



Using lay. . \ = \(Jy. ^ \ and fjj)ji^k) > Wvi ^ l> inequality (8) is straightforward. 

This reweighting process must terminate because in each step, the number of Wij set 
to zero increases by one. What we end up with is an instance where for each i we 
have exactly one Wij with Wij = Wi, and all the other Wij are zero. This instance is 
equivalent to {a, L, W) we started with. Just rename the Yij with Wij = Wi to Xi and 
forget about the Yij which have weight zero. Because we did not increase the value of 
( 5 ) hy changing the weights, we have a schedule for (tr, L, W) which has the same cost 
as the optimal schedule for (cr', L'). D 



3 A Lower Bound 

A total ordering of n items in a list L is defined by the relative ordering of all (2) pairs 
of items in L. For example, the list state L' = [X2X3X1] corresponds uniquely to the 
set 

{X2<XyX3<XyX2<X3}. 

Instead of one list containing n items, we can represent the list state by two-item lists 
LxiXj , for each pair of items in L. The state formed by these (2) pair lists is legal if and 
only if we can combine the relative orders induced by the pair lists to a total ordering. 
Thus, the state 



LxiX2 = [X1X2], LX2X3 = 1X2X3], LxiXs = 

is not legal. But if we changed the ordering of the items in L X2X3, it would correspond 
to [A3X1X2]. 

We will serve our requests on the pair lists. In order to satisfy a request to an item Xi, 
we must access this item in all pair list it occurs. There are n — 1 of them. Hence, on a 
pair list Lx^x , we serve only those requests in a that go to or Xj. We denote this 
subsequence by crXiXj and call it the projection of a to Xi and Xj . At any point in time, 
we are allowed to transpose the items in a pair list, paying Wi ■ Wj units. But we have 



Offline List Update is NP-hard 



47 



to make sure that this leads to a legal state. If we access Xj in the list [XiXj], we pay 
Wi ■ Wj units, hut in [XjXi] we pay noting. We call these access and transposition cost 
pair costs, hecanse we can assign them uniquely to a pair if items, namely {Xi,Xj}. 
In addition to the access cost we already mentioned, we pay b units for each request, 
where b = ( 2 ) if ths requested item has weight w. We refer to these cost as item cost. 
Note that the item cost depend only on the request sequence, but not on the schedule. It 
can be easily seen from the dehnitions of the two models that they are equivalent. 

From the pair list based model, we can derive a lower bound for WOPT, to which we 
will refer to as WOPT [2]. In order to do so, we forget about the condition that the state 
defined by the pair lists must correspond to a total ordering of all n items. So in WOPT, 
any state of the pair lists is legal. This means that for each pair of items {Xi,Xj}, 
we can choose their relative order independently of the other pairs. In order to get the 
optimal schedule for the pair {Xi,Xj}, we just serve aXi,Xj on Lxi,Xj optimally. 
Thus, to derive WOPT, we just have to solve ( 2 ) WLUP instances on two items each. 
Fortunately, the algorithm described in the introduction runs in linear time if the number 
of items is fixed. Thus, WOPT is computable in polynomial time. 

Note as well that the optimal schedule for an instance on two items does not depend 
on the weights, at least if both weights are strictly positive. This holds because of two 
reasons. First, the item cost are constant for every schedule of a given request sequence. 
Second, the pair cost are of the form k ■ {wi ■ Wj), where k does not depend on the 
weights. 

Thus, the hardness of the problem comes from the total ordering that must hold at any 
time in the schedule of WOPT. Most of the time, the actual state WOPT is in will not 
be a legal one for WOPT. Therefore, some pairs will have to be served non-optimally 
in WOPT. 

4 The Reduction 

Lemma 1 allows us to show a reduction from an NP-complete problem to WLUP in 
order to prove NP-hardness of OLUP. 

The minimum feedback arc set problem (MINFAS) [ 1 2] will serve well for that pur- 
pose. Its decision variant MINFAS{G, k) is defined as follows. Given a directed graph 
G = {V, E) and fc € N, we want to decide whether there is a subset E' C E with 
\E'\ < k such that the Graph G' = {V,E — E') is acyclic. 

There is a second interpretation which is more natural for our purpose. We interpret an 
arc pointing from Vi to Vj as a constraint “vi should be ahead of vf’. What we want 
to decide is whether there exists a total ordering of the vertices such that less than k 
constraints are unsatisfied. 

We show a reduction from MINFAS{G, k) to the decision version of WLUP, denoted by 
WLUP{a, L, W, k'). Here we want to decide whether there is a schedule which serves 
(T from the initial state L at maximal cost k' . More precisely, we give a function / that 
takes G and k as arguments and returns {a, L,W) such that 



MINFAS{G, k) ^ WLUP{a, L, W, WOPT(a, L, W) + k). 



(9) 
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From now on, we abbreviate the cost WOPT{a, L, W) by wopt. For the following sec- 
tion, it is important to understand how WOPT behaves in different situations. As WOPT 
treats all pairs of items independently, we have to investigate how sequences on two 
items are served optimally. Remember that in the two items case, the behavior does not 
depend on the weights, therefore it behaves like OPT. 

We consider a list containing only the items a and b. In order to describe WOPT, we 
must hnd out in which cases one must, can or must not swap the two items. The follow- 
ing table gives the answer for a few cases, depending on how the remaining part of the 
request sequence looks like. We will encounter these cases later in this section and then 
refer to them by their number in (•). The notation is analogous to the one for regular 
expressions. Thus, (6a)* denotes the empty sequence or any number of repetitions of 
6a. The sequence A can be any sequence on a and 6. If there is no A at the end of the 
sequence, we assume that this is the end of the sequence. 



must not swap: 


(0) 

(1) 

(2) 


[ab] aX 
[ab] baaX 
[ab] (6a)* 


can swap: 


( 3 ) 

( 4 ) 


[ab] babbX 
[ab] {ba)*b 


must swap: 


( 5 ) 


[ab] 666A 



We now describe the function / which transforms a MINFAS instance into a WLUP 
instance. For every vertex of G = (V, E), we have a weighted item Vi with weight 
k + 1 . We call them vertex items and define n := \ V\. Additionally, we have two items 
c and d both with weight one. 

These are all the items we need. Let us check briefly that the weights are not too large 
in order to make ( 4 ) work. Clearly, the hard MINFAS instances obey k<\E\< |C| 2 . 
Hence in those cases, the weights of the items are polynomial in the number of items. 
Thus, the reduction from WLUP to OLUP is polynomial. 

We set L = [C1V2V3 . . . lA„C(i]. The sequence a is basically of the form 

{ViV 2 V^ . . .VnY , 

with additional requests to c and d. It consists of two parts a' and a” . The first part is 

a' = V1V2V3 ...Vn. 

The second part consists of so called arc gadgets. An arc gadget 7 for (vi,Vj) G E 
looks as follows. Basically, it consists of 6 repetitions of tr', with additional requests to 
c and d. If i < j the gadget looks as follows. 

7 = Vi...Vi-icVi...VjdVj+i...ViCcVi+i...Vj-idddcccVj...Vna''a'a'a' (10) 

For j < i, we have 

7 = Vi...Vj...V-icVi...VjdVj+i...ViCcV+i...Vj-idddcccVj...Y...Vna'a'a' ( 11 ) 
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The idea of 7 is to make WOPT pay one unit extra if and only if Vj is behind Vi at the 
first request to d in the gadget. 

Lets partition the set of arcs in G into two subsets. contains the arcs Vj) with 
i > j, whereas E~ contains those with i < j. In a”, we have one arc gadget for each 
arc in G, with the additional restriction that all the arc gadgets of the arcs in £1 + precede 
those in E~ . 

In order to prove (9), we first look at some properties of this instance. We have seen in 
section 3 that the cost spent by WOPT on a pair of items lower bounds the cost WOPT 
needs for this pair. Therefore, in a schedule that costs no more than wopt+ k units every 
pair of items involving a vertex item must be served according to WOPT. This holds 
because the pair cost of such a pair are a multiple of (/c + 1). Therefore, any non-optimal 
step involving a vertex item costs at least fc + 1 additional units. 

Let us first look at pairs consisting of two vertex items Vi and Vj,i < j. In the initial 
state, Vi is in front of Vj . Therefore, WOPT has to serve the following request sequence 
from initial state 



VVjVVj...Wj ( 12 ) 

One way of optimally serving this instance is to do nothing at all. But there are even 
more optimal schedules for this sequence: In order to stay optimal, we are allowed to 
swap the two items exactly once (check (0), (2), and (4)). Because one should never 
move Vj in front of Vi when the next request goes to Vi (0) , this swap has to take place 
before a request to Vj . 

It is easy to see that in an optimal schedule which pays less than wopt + k units, the list 
state before and after every gadget must have the sublist [cd] at the end of the list state: 
Because of the three repetitions of a' at the end of the edge gadgets and because of (5), 
the items c and d must surely be at the end of the list. If we now have a closer look at 
the requests to c and d only, we notice that a gadget ends up with three requests to c and 
starts with another one to c. Therefore, in order to be optimal, c has to be in front of d. 

To see how WOPT serves the gadget for (vi,Vj), we have to look at two cases. If Vi is 
in front of Vj, we can serve it with the same amount WOPT would pay. This is easy to 
check by showing that there is a schedule which serves each pair of items like WOPT 
would do: The crucial point is that in WOPT, when the first request to d takes place, 
d must still be behind c ( 1 ), while c must be behind Vi ( 1 ) and d must be in front of 
Vj (5). Only if Vi is in front of Vj, we can fulfill these conditions. Note that c and d 
can pass 14, A: ^ {u j}> but they do not have to (3) . At the next request to c, we move 
c to the front of the list (3, 5). Later, at the first request of the request triple to d, we 
move d to the front as well (5) , but c will have to pass d again later (5) . Because of the 
additional finishing 7 , both c and d must be moved behind all vertex items at the 
end without changing the relative order of c and d (5) . 

If Vi is behind Vj , not all the conditions mentioned in the previous paragraph can be 
fulfilled at the first request to d. The only way to fix this without paying more than 
k extra units is to move d in front of c at the first request to d and thus pay one unit 
more than WOPT. 
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Now we are ready to prove (9). The easier part is the direction. If we can get an 
acyclic graph G' by removing only k arcs, we sort the vertices of G' topologically. The 
schedule which costs at most wopt + k looks as follows. We use the initial sequence a' 
to rearrange the items Vi according to the topological order (4) . For the rest of a, we 
do not change the ordering of the vertex items anymore. Thus, we serve all vertex pairs 
optimally. 

Concerning the arc gadgets, we can serve all those corresponding to the arcs in G ' like 
WOPT would do it. For each arc we removed from G, we have to pay one unit extra. 
As there are at most k of them, we pay at most wopt + k units to serve a. 

It remains to prove the <;= direction of (9). There are at most k gadgets whose cost 
where higher than those of WOPT. We say a gadget is served well if we payed no more 
than WOPT to serve it. We will show that if we remove the arcs corresponding to those 
gadgets, the resulting graph will be acyclic. 

Let the arcs ofCQE form a cycle in G. We have to prove that there is at least one arc 
gadget belonging to arcs in G which is not served well. For any arc e = {vi,Vj) and a 
list state L, we say e is open if we have Vi in front of Vj in L and closed otherwise. 
The arcs in C C if + are those which are closed in the initial list. In order to serve their 
gadget well, it has to be open when their gadget is served, but we cannot close them 
anymore (2). The arcs in C C if “ are open in the initial list. If we want to serve them 
well, we can not close them before their gadget is served because we cannot reopen 
them (2). 

Let us have a look at the list just after we served all arc gadgets for if + in cr. In order 
to serve all gadgets belonging to G well, all of them must be open at this time. This 
means for any arc e = {vi,Vj) in G that the item Vi must be in front of Vj in the current 
list. Because G forms a cycle, at least one of them must be closed and hence was not (if 
it belongs to if+j or will not be (if it belongs to E~) served well. This concludes the 
proof. 



5 Conclusions 

We have shown that the offline list update problem is NP-complete using a reduction 
from minimum feedback arc set to weighted list update. By changing the gadgets 
slightly, one can prove the same result also for the case where only free exchanges are 
allowed in order to update the list. 

As an open question, it remains to show whether it can be decided in polynomial time 
whether 

OPT{(j) = OPf{a). 
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Abstract. The problem of computing a largest common point set (LCP) 
between two point sets under e-congruence with the bottleneck matching 
metric has recently been a subject of extensive study. Although polyno- 
mial time solutions are known for the planar case and for restricted sets of 
transformations and metrics (like translations and the Hausdorff-metric 
under Loo-norm), no complexity results are formally known for the gen- 
eral problem. In this paper we give polynomial time algorithms for this 
problem under different classes of transformations and metrics for any 
fixed dimension, and establish NP-hardness for unbounded dimensions. 
Any solution to this (or related) problem, especially in higher dimensions, 
is generally believed to involve implementation difficulties because they 
rely on the computation of intersections between algebraic surfaces. We 
show that (contrary to intuitive expectations) this problem can be solved 
under a rational arithmetic model in a straightforward manner if the set 
of transformations is extended to general affine transformations under 
the Loo-norm (difficulty of this problem is generally expected to be in 
the order: translations < rotation < isometry < more general). To the 
best of our knowledge this is also the first paper which deals with the 
LCP-problem under such a general class of transformations. 



1 Introduction 

Let e > 0 be a real number, Q he a, transformation group (such as translations, ro- 
tations, isometry, or linear transformations), and A and B be two d-dimensional 
point sets. A largest common point set (LCP) [2,3] between A and B under e- 
congruence is a subset A' of A, having the largest possible cardinality, for which 
there exists a transformation g & Q such that the distance between the sets g{A') 
and B ' is less than e, where B ' is some subset of B, and the distance is measured 
using some appropriate metric. Related geometric problems for determining the 
similarity between two point sets have been extensively studied. Two commonly 
used metrics for quantifying the notion of similarity have been the Hausdorff 
distance [11,12,18] which is defined as the maximum distance between a point 
in one set and its nearest neighbor in the other set, and the hottleneck match- 
ing metric [14] seeks a perfect bipartite matching between two equal cardinality 
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point sets such that the maximum distance between any two matched points is 
minimized, and it returns this distance. 

A systematic study of these problems was initiated by Alt et al. [5]. They 
presented algorithms for several versions of the problem for planar point sets 
under the bottleneck matching metric. In particular, they proposed an 0(n®) 
decision algorithm to determine if there exists an isometric transformation using 
which two equal cardinality planar point sets can be brought to within e distance 
of each other under the Euclidean bottleneck matching metric. Although they do 
not mention it explicitly, it is straightforward to adapt this algorithm to compute 
the LCP of two planar point sets (not necessarily of equal cardinality), without 
incurring any increase in running time. 

Problems involving the exact matching metric, where two points are said to 
match only when the underlying geometric transformation takes one of them 
exactly on to the other have been studied in [1,3,13,21]. Given two point sets, 
the problem of computing the minimum Hausdorff distance between them has 
been studied with different parameters such as only translation and general Eu- 
clidean motion, planar and d-dimensional point sets, and the underlying metrics 
being Loo, L\ and L 2 [11,12,18]. In an effort to improve the running time, vari- 
ous approximation algorithms for either the Hausdorff or the bottleneck metric 
for point sets in two, three, and in general d-dimensions have been presented 
in [9,10,15,16,17,19,20,22]. 

Pattern matching using bottleneck metric It should be noted that most of 
the known exact algorithms, especially those involving three and higher dimen- 
sional point sets, are restricted to either the exact or the Hausdorff metric. While 
the exact metric is ill-posed for many practical applications, many problems also 
demand a one-to-one matching between the two point sets, thereby rendering 
the Hausdorff metric unsuitable in such situations [14]. This motivates the study 
of the problem using the bottleneck matching metric. However, in contrast to 
the Hausdorff metric where the distance between two point sets is in essence de- 
termined by the distance between two points each of which belongs to one of the 
sets, the existence of a bottleneck matching is a global property of the point sets 
and therefore complicates the problem. As a result, it is not apparent how the 
algorithms concerned with the Hausdorff metric can be adapted for computing 
the bottleneck matching. Neither do the algorithms of [5] extend from the planar 
case to work in three or higher dimensions in any simple way. Very recently, a 
new paradigm for point set pattern matching based on algebraic convolutions 
was proposed in [9] and [19]. This reduced the complexity of the problem under 
Hausdorff metric to nearly quadratic time. However, as noted in [20], “the one- 
to-one restriction imposed by bottleneck matching distance seems not to fit well 
within the rigid framework of algebraic convolutions” . 

Our results In this paper we present polynomial time exact (in contrast to 
approximation) algorithms for computing the LCP between two d-dimensional 
point sets under e-congruence with the bottleneck matching metric under the L 2 - 
and the Loo-norms and various classes of transformations. All of our algorithms 
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are based on the general framework of traversing an arrangement of surfaces 
or hyperplanes in some high dimensional space, based on the dimensionality of 
the point sets. This can be considered as a generalization of the concepts in the 
original algorithms for bottleneck matching presented by Alt et al. in [5] and 
in those proposed more recently for the Hausdorff metric by Chew et al. in [11] 
and [12]. 

We prove that for unbounded dimensions the problem is NP-hard. All previ- 
ous hardness results pertain to the exact matching metric and show that subset 
matehing [3] is NP-hard for unbounded dimensions [1] and computing the LCP 
of an unbounded number of point sets even in one dimension is hard [2] . Polyno- 
mial time algorithms for d-dimensional point sets (fixed d) are known either for 
the exact metric [1] or for Hausdorff metric under the Aoo-norm and restricted 
set of transformations [11]. 

Many geometric pattern matching algorithms tend to suffer from implemen- 
tation difficulties because of their reliance on algebraic techniques. As noted by 
Alt and Guibas [4] these algorithms “are probably difficult to implement and 
numerically unstable due to the necessary computation of intersection points of 
algebraic surfaces” . We show that (surprisingly) if we extend our transformation 
group to include general affine transformations (isometries are a special case of 
this) then under the Too-norm this problem can be solved using a realistic ratio- 
nal arithmetic model. To the best of our knowledge this is the first time that the 
LCP-problem is being considered under such general transformations. All of the 
previous algorithms have considered either translations, rotations, isometries, or 
at most scaling [21]. It was claimed that the algorithms in [21] generalize to 
broader classes of transformations, but these were not dealt with explicitly. 

Before proceeding further we first give a formal definition of our problem. A 
point set S is er-congruent under the bottleneck matching metric and a trans- 
formation group ^ — !■ to a point set S' if there exists a transfor- 

mation g (z G and a bijective mapping I : S S' such that for each point 
s £ S, S{g{s),l{s)) < e, where <5(-, •) denotes some metric such as L 2 or Loo (we 
will only treat these two in the sequel). Given two point sets A and B, and a 
real number e, the LGP between A and B is the maximum cardinality subset 
A' <£ A which is e-congruent to some subset of B. In what follows, e-congruence 
is always assumed to be under the bottleneck matching metric. 

We outline our basic method in Section 2. In Section 3 we present the two- 
dimensional realization of the general scheme, followed by the d-dimensional case 
in Section 4. In Section 5 we establish the NP-hardness result. 



2 Computing the LCP — the General Scheme 

Let A = {ai, . . . , a„} and B = {b\, ... ,bm} be two point sets in d-dimensional 
real space and fix any metric S on R'^. For any transformation L : R"^ ^ R"^ 
and given indices i £ [n],j £ [m ] , we define %j (L) as the set of translations that 
map the point L{ai) into the e-neighborhood of bj under the metric S. Identifying 
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a translation with its defining vector v, we can write this set as 
Tij{L) := {r; G I S{L{ai) + v, bj) < e}. 

If both, the transformation L and the translation vector v are fixed, then the 
distance between the point sets L{A) + v and B can be computed by constructing 
a bipartite graph G = {AU B, E) and computing the maximum matching in it. 
Here, for m G A and bj G B, the edge (ai,bj) G if if 6{L{ai) + v,bj) < e. 
Computing the LCP between A and B under e-congruence essentially amounts 
to finding the optimal transformation L and the translation vector v under which 
the graph G has the largest maximum matching. For isometric transformations, 
we would restrict the transformation group from which L is chosen to consist 
of only pure rotation. This restricted definition of an isometry does not result 
in any loss of generality because isometry including mirror image just increases 
the computation time of any of our algorithm by only a constant factor. An 
algorithm now has to be run once with A and B, and then with A and a mirror 
image of i? on a plane that can be chosen arbitrarily. 

LCP under translation Here, the transformation L is fixed, and we are look- 
ing for the LCP between the point sets L{A) = {L{ai) \ i G [n]} and B under 
some translation v. The ‘overlay’ of all the possible sets of translations %j{L) for 
all indices i G [n] and j G [m] forms an arrangement A{L), inducing a decompo- 
sition of the space into a number of faces. A face of A{L) is a maximal set of 
vectors with the property that any two v, v' in the set have the same relation to 
all 7ij{L), meaning that v lies inside (on the boundary of, outside) %j{L) if and 
only if the same is true for v' . Cells are faces not contained in any boundary of 
some %j{L). 

If the metric 5 is induced by the Loo-norm or the L 2 -norm, the sets Tij{L) 
are balls of radius e which are cubes in the former case and Euclidean-balls in 
the later. In the Loo-case, the cells of the arrangement A{L) are then simply 
axis-aligned boxes, while they are bounded by spherical surface patches in the 
L 2 -case. Also, the following property clearly holds. 

Property 1. Let Vopt G be some translation that enables in computing the 
LCP between L{A) and B. Then Vopt lies in some cell of A{L), and all vectors 
v' in that cell are optimal translations as well, meaning that they could also be 
used in computing the LCP. 

This property results in the formulation of a discrete version of the problem 
since the process of finding Vopt now reduces to examining just one vector in every 
cell of A{L) and computing the maximum bipartite matching in the resulting 
graph. Any vector in the cell for which the graph has the largest matching serves 
as an optimal translation. 

LCP under general transformations Now we turn to the more general 
problem of computing the LCP between A and B under any transformation L 
from some group (such as rotation, scaling, or linear transformations), followed 
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by a translation v. It is clear that as L varies, the LCP between L{A) and B 
under translation remains invariant as long as the arrangement A{L) does not 
undergo any combinatorial change, meaning that a cell does not disappear and 
no new cells appear. The different possible combinatorial structures arising from 
the arrangement A{L) as L varies, partitions the space C of all transformations 
L under consideration into a number of cells. Each cell in this case is defined 
as the maximal set of transformations generating combinatorially equivalent ar- 
rangements A{L). Therefore the LCP-problem is reduced to a cell enumeration 
problem in an arrangement in the space C. We are required to find a transforma- 
tion L in every cell of this arrangement, compute A{L) and compute an optimal 
translation for that given L by solving the LCP-problem under translation. 



3 LCP in 2-d under Isometry with the ^ 2 -Norm 

Let us assume without any loss of generality that the point set A is being rotated 
about the origin followed by a translation of the rotated set. Since the point 
sets involved are planar, the angle of rotation can be parametrized by a single 
parameter 0 G [0, 27 t), the transformation L as described in Section 2 is therefore 
rotations and the space £ is [0, 27 t). Since the underlying norm is L 2 , for each pair 
of points ai G A and bj G B, and a fixed angle 9, the set %j (9) of translations that 
take Rg{ai) (which is the point resulting out of rotating at about the origin by 
9) into the e-ball around the point bj is a circular disk of radius e in the space 
of possible translations. Overlaying all such 0{mn) circular disks for i G [n] 
and j G [m] forms the arrangement A{9). Corresponding to each cell c of this 
arrangement we can construct a bipartite graph Gc = {AU B , E) where for each 
pair of points Qi G A and bj G B, the edge (a^, bj) G E if the cell c lies within the 
disk Eij{9). We are interested in finding the graph Gc with the largest maximum 
bipartite matching over all the possible graphs corresponding to the different 
cells of A{9) arising from all possible values of 9. 

As 9 varies, the combinatorial structure of the arrangement changes as any of 
the following two conditions are fulfilled: two circular disks of A{9) touching and 
three disks meeting at a point. The first condition results in a linear equation 
in sin 9 and cos 9 and therefore can be transformed into a quadratic equation in 
either sin 9 or cos 9. The second condition results in a cubic equation in sin 9 and 
cos 9 and can be transformed into an algebraic equation of degree six in either 
of them. Since there are 0{mn) disks in the arrangement A{9), it gives rise to a 
collection of 0{nrArA) univariate algebraic equations of at most degree six. 

The solutions to the equations result in partitioning the space [0,27 t) into 
0{rrArA) intervals and each interval corresponds to a set of combinatorially 
equivalent arrangements arising out of the angles 9 lying within the interval. 
It may be observed that as we move from one interval to its adjacent one, 
the combinatorial structure of an arrangement A{9) (for any 9 belonging to the 
former interval) changes with the introduction of a new cell or the disappearance 
of an existing one. 
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Our algorithm traverses the intervals of [0, 27 t) and at each interval 

if a new cell appears in the arrangement of disks then it constructs the bipartite 
graph corresponding to this cell and computes the maximum matching in it, 
which takes 0{mny/m + n) time. The worst case overall running time of the 
algorithm is therefore 0{m^n^\/rn + n). 

4 LCP in d Dimensions 

This section gives polynomial-time algorithms for computing the LCP of two 
d-dimensional point sets, under isometries and general affine transformations, 
and under the L 2 - as well as the Loo-norm. The case of affine transformations 
with the Loo-norm is special in the sense that it can be solved with rational 
arithmetic in a straightforward manner. In the other cases, one needs to resort 
to algebraic techniques even in two dimensions [5]. 



4.1 AfRne Transformations and the Too-Norm 

Following the general scheme given in Section 2, we consider the arrangement 
A{L) for any fixed linear transformation L. For any set %j{L) = {x\,xi -I- er) x 
. . . X (xd, Xd + s)^ the values Xt,Xt + e are the first and the second coordinate of 
%j{L) in direction t. As L varies, the combinatorial structure of A{L) can only 
change at points where for some i,j, k,i, %j{L) and Tkt{L) share a coordinate 
in some direction t. It is easy to see that 



("^) — T ^2 ■ — {"c “t“ Oi | v G (1) 

where id denotes the identity transformation. Let x\^ and x^^ be the coordinates 
of Tij(id) and Tki(id) in direction t. 

According to (1), %j{L) and Tkg{L) then share a coordinate in direction t if 
and only if 

xl^ - -k (Oi - Ofe - L{ai - ak))t G {~£, 0, e}. (2) 

Considering L as a, dx d-matrix with variable coefficients xn, . . . Xdd, we see that 
conditions (2) are linear equality constraints involving the variables Xti,. ■ -Xtd- 
In other words, all combinatorial changes in A{L) can be characterized by 
hyperplanes in d^-dimensional space. Any cell in the arrangement of those 
hyperplanes corresponds to a set of linear transformations L that generate com- 
binatorially equivalent arrangements A{L). 

To find the LCP of A and B under affine transformations, we therefore pro- 
ceed as follows. 

1. Traverse all cells of TC. Because H is defined by 0{rn^n^) hyperplanes in 
dimension d^, the number of cells is in the worst case, and all cells 

can be traversed in time proportional to their number, using reverse search 
with space [6,23]. 
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2. In each cell of Ti, choose a representative transformation L (the reverse search 

algorithms [6,23] used to traverse H. actually generate such ‘points’ L, so 
there is no extra effort involved here). Traverse all cells in the arrangement 
A{L), which is an arrangement of unit cubes. The reverse search paradigm 
can also be adapted to this case within the space bounds of step 1 and 
with time ^ Por this, one can consider A{L) as a subset of cells of 

the arrangement induced by the facet-defining hyperplanes of all the cubes 
contributing to A{L). 

3. In each cell of A{L), perform the graph matching. This can be done in 

time. 

4. Among all the matchings that have been computed, return the one with 
maximum cardinality. 

The overall runtime of the algorithm is {mn)^‘^ + 0 (d) is polynomial for 

any fixed d. The space complexity is thus independent from d. 

Note that step 3 can be improved by only considering the cells of A{L) which 
have been created in the combinatorial change encountered by going from A{L') 
to A{L), where L' is a representative transformation in some neighboring cell 
of Ti. that has previously been processed. Typically, there are only few ‘new’ 
cells and they are easier to compute than the whole arrangement A{L). This 
approach is possible, because the reverse search traverses Ti along a tree of cells 
whose edges correspond to neighboring cells. However, as the complexity of this 
step is dominated by step 1 of the algorithm, we will not elaborate on this. 

4.2 Incorporating Isometries and the ^ 2 -Norm 

First let us consider the case of general affine transformations under the ^ 2 -norm. 
In this case, A{L) is an arrangement of Euclidean e-balls, whose combinatorial 
structure changes if and only if some circumscribed ball of k ball centers (2 < 
fc < d -I- 1) attains radius e. If qi, ... ,qk are those ball centers, it can be shown 
that their circumcenter c is given by c = gi -I- where qe = q£ — q±, 

and the Xe are obtained from solving the system of linear equations 
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The ball centers depend linearly on the entries of L (cf. equation (1)); 
Cramer’s rule then implies that the entries of det(M)M“^ are polynomials in 
the entries of L (det(M) itself is a polynomial too, of course). This again means 
that the condition 

k 

||^A£qff = £2 (4) 

i=2 

for a combinatorial change can be written in the form f{x) = 0, where / 
is a constant-degree polynomial (the constant being in 0(d)) in the entries 
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xii, . . . ,Xdd of the matrix L. Our strategy will be as before: we consider the 
set of all the polynomials / coming from the conditions on any k 

(2 < k < d + 1) out of the mn er-balls for the pairs of points in Ax B] these 
polynomials define an arrangement TL, whose cells correspond to combinatorial 
equivalence classes of arrangements A{L). By traversing the cells of Ti, we can 
generate all the combinatorial types that A{L) may assume; if we perform for 
each type the graph matching in all the cells, the LCP will be found. We use the 
following result to ‘compute’ Ti. 

Theorem 1 (Basu, Pollack, Roy [7]). Let V = {fi, . . . , fr} be a set of p- 

variate polynomials with rational coefficients and maximum algebraic degree s 
{p < r). Two points q,q' G are equivalent i/ sign(/£(q)) = sign{fi{q')) for 
I = The vector (sign(/i(g)), . . . , sign(/£(q)) is a sign condition ofV. 

In 0{r{r /p)P arithmetic operations, one can compute all sign conditions 
determined by V . 

Applied to our setting with r = {mn)^^'^\p = df, and s = 0{d), we can ob- 
tain all sign conditions in ^ time. The ‘full-dimensional’ sign conditions 

(the ones containing no zero sign) correspond to the combinatorial equivalence 
classes of linear transformations L we are interested in. Therefore, this solves 
our problem, once we can obtain a representative transformation L from every 
such class. 

Although full-dimensional sign conditions are always attained by suitable 
transformations L with rational coordinates, such transformations might be dif- 
ficult to find. However, all we need is the combinatorial structure of A{L), and 
this can be derived directly from the sign condition. More precisely, for every 
subset B of e-balls, we can check whether A{L) contains a point in all the balls 
of B, by checking whether all d + 1-element subsets of B have a common point 
(Helly’s Theorem). The latter information is given by the sign condition, though. 
Processing sets B by increasing size in a dynamic programming fashion, we can 
construct all sets B that define cells of A{L) in time proportional to their number 
(which is (mn)®^'^^), incurring only some polynomial overhead. 

The overall runtime (including all graph matchings) is then bounded by 

In order to handle isometries, we exploit the fact that L defines an isometry 
if and only if L~^ = (and det(L) = 1, in case we want to limit ourselves to 
rotations). These conditions give rise to 0{df) additional polynomial equations 
which we add to the ones already obtained from the combinatorial change con- 
ditions. As before. Theorem 1 can be used to deal with this case in (mn)*^*-'^ ^ 
time. 

4.3 ^ 2 -Norm and Rotations in 3-Space 

The goal of this subsection is to give a concrete bound on the exponent entering 
the runtime in case of a three-dimensional LCP problem. Here it is crucial to 
use the fact that a rotation in 3-space has only 3 degrees of freedom. More pre- 
cisely, a rotation in 3-space can be parametrized by three parameters 4>\, 4>2, 4>3, 
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and the resulting transformation matrix L has entries which are polynomials in 
sin , cos , * = 1,2,3. The combinatorial change conditions of type (4) men- 
tioned in Section 4.2, resulting from all 2-, 3- and 4-tuples of balls give rise to 
polynomial equations of some constant maximum degree in six vari- 
ables, say ri and s^, i = 1, 2, 3, where Vi = sin (pi and Si = cos (f>i. 

Consider a family of r polynomials 7^ in p variables, each of degree at most 
s, and an algebraic variety V of real dimension p' which is defined as the zero 
set of a polynomial of degree at most s. Using rP arithmetic operations 

it is possible to compute a set of points in each non-empty semi-algebraically 
connected component of V over V, along with the signs of all the polynomials of 
V at each of these points [8]. The number of such points is Applied to 

our case with the set V as the polynomials in the six variables and 

Si, i= 1,2,3, and the variety V as the zero set of the polynomial X)i=i 2 + 

sf — 1)^ we can obtain in 0{m}^n}^) time the sign conditions of 0{m}^n}^) cells 
of the arrangement defined by V . Given a fixed sign condition the corresponding 
arrangement A{L) consists of cells, (combinatorial descriptions of) 

which can be computed by dynamic programming as indicated before in 
time (for each cell we get an overhead which is at most linear in the number 
of balls). The graph matching for each cell takes 0{mn^/m + n) time, therefore 
resulting in an algorithm with total complexity + n). 



5 NP-Hardness of LCP 

In this section we prove that the LCP problem under approximate congruence 
is (not surprisingly) NP-hard, even when the transformation group is restricted 
to only translations. ^However, the proof establishes hardness also if we allow in 
addition general linear transformations, or isometries. The proof is a reduction 
from SAT. Suppose we are given a SAT formula 0 with m clauses over n variables. 
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^ David Eppstein (personal communication) independently found another proof of 
this. 
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We transform (j) into a pair of point sets A = A{(j)), B = B{(p) C R" such that 
|A| = m, \B\ = k, k > m the number of literals in (j), and with the property that 



m. This shows that LCP is NP-hard. 

For every clause Ci, i < m we create a point Oi := (i, 0, . . . , 0). For Cm, we 
define ;= (m+l,...,0). The collection of all the a^s forms the set A. 

To get the set B, we proceed as follows. For every variable xj occurring in 
clause Ci (in form of Xj or Xj) we introduce a point bij, with coordinates 



where we choose the plus sign if the variable occurs non-negated, and the minus 
sign otherwise, ej is the j-th unit vector, and e' is chosen slightly larger than e 



the set B. Fig. 1 gives an example for m = n = 3 (for better readability, the 
set B has been shifted vertically; this does not change the LCP, of course, if 
translations are among the allowed transformations). 

Theorem 2. The formula (j) is satisfiable if and only if the e-LCP of A(rj)) and 
B{(j)) has size m. 

Proof. Let us first assume 4> is satisfiable, and let Xi,i = 1 ... ,n with Xi G 
{true, false} be a satisfying assignment. Define an n-vector v by 



The claim is that v defines a translation which maps every Oi into the e- 
neighborhood of some bij, thus giving rise to a er-LCP of size m. To see this, 
consider a fixed at and let Xj be a variable whose value Xj is responsible for 
satisfying clause Ci. If Xj occurs in Ci unnegated, we have 



if e' is suitably chosen, so is indeed approximately matched with bij. The case 
where Xj occurs negated in Ci is similar. 

This argument actually holds for any Lp-norm with p > 2, in particular for 
the Loo-norm. 



4> is satisfiable if and only if the LCP of A and B has the largest possible size 




(we will see below what that exactly means). The collection of all the 6yS forms 




er'/n, if Xi = true 
e'/n, otherwise. 
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This implies 
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For the other direction, let us assume that a e-LCP of size m exists, i.e. all 
Qi are matched. We first observe that for any i, at must have been matched to 
some hij. This holds by construction, if e is sufficiently small (e 1), which we 
assume for the rest of the proof. Under translations, this is quite obvious, and 
under general affine transformations, the gap between a„_i and a„ prevents the 
only alternative matching at i = 1 . . .n. 

Now we define an assignment of truth values to variables as follows: we set 
Xj to true (false) if some is matched to bij, where xj occurs non-negated 
(negated) in Ct. If none of the two cases occur, the value of Xj is arbitrary. 
This is well-defined, because if xj occurs negated in Ci and non-negated in Ck, 
then ||(ai — Ofc) — (bij — bkj)\\ = 2e', which means that one of \\ai — bij\\ and 
||afe — bkjW must be at least e' > e. This shows that if some is matched to a 
point bij coming from a non-negated (negated) instance of xj, then this holds 
for all Qk that are matched to b^j. This also easily implies that the defined 
truth-assignment satisfies one literal in every clause. 



6 Concluding Remarks 

The main aim of this paper was to establish polynomial time bounds for solv- 
ing the LCP-problem in bounded dimensions under bottleneck matching. As a 
consequence the time bounds presented here are not the tightest possible ones. 
We believe that there is scope for reasonable improvements in this direction. For 
example, the algorithm for planar point sets presented in Section 3, which is a 
straightforward realization of our general scheme, runs in time 0(n® ®) compared 
to 0(n®) of [5]. Both the algorithms require solutions to algebraic equations of 
degree six. However, the number of events in which three disks touch at a point 
can be bounded by O(m^n^) instead of 0{m^n^), using the fact that the dy- 
namic Voronoi diagram of k sets of rigidly moving points on a plane with each 
set having m points has a complexity log* k) [12]. This reduces the worst 

case time complexity of our algorithm to 0(n^'®). For realistic point sets it would 
be faster since the number of graph matchings depend on the number of com- 
binatorial changes in the arrangement of disks as the angle of rotation varies, 
which in general would be much smaller. 
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Abstract. We consider the problem of caching multimedia streams in 
the internet. We use the dynamic caching framework of Dan et al. and 
Hofmann et ah. We define a novel performance metric based on the max- 
imum number of simultaneous cache misses, and present near-optimal 
on-line algorithms for determining which parts of the streams should 
be cached at any point in time for the case of a single server and sin- 
gle cache. We extend this model to case of a single cache with different 
per-client connection costs, and give an 8-competitive algorithm in this 
setting. Finally, we propose a model for multiple caches in a network and 
present an algorithm that is 0(A)-competitive if we increase the cache 
sizes by 0{K). Here K is the number of caches in the network. 



1 Introduction 

In the classical caching problem, requests are made on-line for pages. There is a 
cache that can hold a small number of pages. Whenever a request is made for 
a page we have a cache hit if that page is currently in the cache, and a cache 
miss if the page is not in the cache. The goal is to maintain a set of pages in the 
cache so that the total number of cache misses is minimized. 

However, for the problem of caching multimedia streams in the Internet, 
a different model is more appropriate for two reasons. First, the size of some 
streams (e.g. video streams) can be extremely large which means that it is infea- 
sible to fit entire streams in a cache. Second, one of the main reasons for using 
caches is to reduce the usage of bandwidth in the network. Hence, we are more 
concerned with the maximum number of simultaneous cache misses rather than 
the total number of cache misses. 

To address the first issue we shall consider the framework of dynamic caching 
as proposed by Dan and Sitaram [.3] and Hofmann et al. [5]. We restrict our 
attention to a single cache that can access streams from a single server. Suppose 
that there is a request for some data stream at time t\ and another request for 
the same data stream at time ^2 = ^1 + (In the terminology of [5], A is the 
temporal distance between the two requests.) Suppose also that there is enough 
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space in the cache to store A time units of the stream. Then, we can serve both 
requests using only one connection to the server (see Figure 1). We always cache 
the last A time units of the stream that were seen by the first request. The 
second request can always obtain from the cache the current stream data that 
it needs. 

In this paper we consider the problem of determining which parts of the 
streams to maintain in the cache so as to minimize the number of simultaneous 
connections to the server. We hence minimize the bandwidth required on the 
link between the cache and the server. We also consider the case of multiple 
caches in a network and propose offline and online algorithms. 



12 -tl 
Blocks 



(a) (b) 

Fig. 1. Dynamic Caching: (a) The two requests before caching, (b) After caching 
the interval t 2 — ti, the bandwidth to the server reduces by one. 



1.1 The Models 

Single cache. We begin by considering a cache and a server connected by a single 
link. There is a data stream (e.g. a video clip) of duration T stored in the server. 
We call one time unit of the clip a block, and denote by &i, & 2 , ■ ■ • , the T blocks 
of the clip. 

Requests for the clip arrive in an online fashion. We denote by ti the time 
at which client i makes a request for the entire clip. Without loss of generality, 
tl < t 2 < ■ ■ ■■ We assume that the client requests must be serviced without delay, 
i.e., client i must receive the block bj at time ti + j — 1. 

The cache has a buffer that can store B blocks of the clip. At time ti+ j — 1, 
if block j is in the cache, then client i can obtain it without cost (we have a 
Cache Hit.) If block bj is not in the cache then it must be obtained from the 
server at a cost of one unit of bandwidth on the cache-server link (we have a 
Cache Miss.) In the latter case, when block bj is obtained from the server, we 
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are allowed to place it in the cache, subject to the condition that we never cache 
more than B blocks. 

Our goal is to determine which blocks of the clip should be cached at any 
point in time so as to minimize the maximum bandwidth that is ever used on 
the link. That is, we wish to minimize the maximum number of simultaneous 
cache misses. More specifically, let be an indicator variable that equals 1 if 
cache i has a cache miss at time t and 0 otherwise. The bandwidth used at time 
t is We wish to minimize maxt M/ . 

Throughout this paper we shall focus on interval strategies. We define the 
interval i to be the set of blocks lying between the block currently required by 
request i — 1 and the block currently required by request i. Hence at time t 
interval i consists of blocks bt-ti+i, ■ ■ ■ , • An interval strategy always tries 

to cache entire intervals. 

We emphasize that our algorithms must be online in the sense that they 
cannot know about the request from client i until time ti. However, since the 
request is for the entire stream of blocks, the algorithm will know at time ti 
that block bj will be needed at time ti + j — 1. We shall use the framework 
of competitive analysis [13] and compare our candidate algorithms against the 
optimal offline algorithm that knows all the requests in advance. 

Single eaehe with per-client connection costs. We next consider a more general 
model in which we have per-client costs. We assume that for each client there 
is a cost di for connecting the client to the cache, and a cost Si for connecting 
the request directly to the server. These costs reflect the bandwidth required by 
these connections. There is also a cost c for connecting the cache to the server. 
For each block required by the client, if the block is present in the cache it can 
be obtained at a cost of di. Otherwise, the block can either be obtained via the 
direct connection to the server at a cost of Si or else it can be obtained from the 
server and brought into the cache at a cost of c -I- di . Our goal is to minimize the 
maximum cost incurred at any time instant. 

Multiple caches In our most general model we have many caches, each of which 
could be placed in a different part of the network. Cache j has size B and can 
be connected to the server at a cost of Cj . Client i can obtain a block from cache 
j at a cost of dy . If the block is not cached anywhere then client i can obtain it 
from the server and bring it into cache j at a cost of cj + dij.^ 

Multiple clips Although the results we derive assume that the server has just 
one clip, the discussion can be easily extended to the case where the server has 
many clips. 

1.2 Results 

— In Section 2 we consider the single cache problem where we have a cache- 

server link and we wish to minimize the required bandwidth on this link. Let 

^ Note that we do not need to consider the cost of connecting to the server directly 
since we can assume that there is a cache at the server with Bj — 0 and Cj = 0. 
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U be the bandwidth of the optimal offline algorithm (that is not necessarily 
an interval strategy) . We first present an interval strategy in which we always 
cache the smallest intervals that will fit into the buffer. We show that this 
strategy requires bandwidth at most U + 1. We also consider how to carry 
out this strategy when the requests arrive online. We first show how to do 
this when we have auxiliary buffer of size B in which we cache the first B 
blocks of the clip. We next show how to dispense with the auxiliary buffer at 
the expense of two additional connections to the server, i.e. the bandwidth 
required is U + 3. 

~ In Section 3 we analyze the single cache problem where we have per-client 
connection costs. We present an online algorithm that is 8-competitive with 
respect to the optimal offline interval strategy. 

— In Section 4, we study the multiple cache problem. Let K be the number of 
caches. We first present an integer programming formulation for the problem 
of choosing the best set of intervals to store at each cache. We show that the 
solution of the LP-relaxation can be rounded to give an integral solution in 
which all caches are increased by a factor of 4itT and the connection costs 
are increased by a factor AK. We also show that if the size of cache j is 
increased once more to 12KB then we can obtain an online algorithm that 
is 12iL-competitive with respect to the optimal offline interval strategy. 



1.3 Previous Work 

There has been much work on the classical caching problem in which requests 
are for arbitrary blocks and we wish to minimize the aggregate number of cache 
misses [1,2,4,9,13]. For example, Sleator and Tarjan [13] showed that the Least- 
Recently-Used (LRU) protocol is R-competitive (i.e. the number of cache misses 
is at most B times the number of misses due to the optimal offline algorithm). 
However, the problem of caching streaming data in a network has received less 
attention. 

Dan and Sitaram presented the framework of dynamic caching in [3]. They 
propose an interval strategy similar to the algorithm we present in Section 2 
in which we aim to cache the intervals that require the smallest buffer space. 
However, Dan and Sitaram do not present a theoretical analysis of this approach 
and they do not consider how to carry out the algorithm in the online setting. 

Hofmann et al. [5] present heuristics for dynamic caching when there are 
many co-operating caches in a network. The caches store different sections of 
the stream and control processes called helpers decide which cache (s) to use for 
any particular client so that connection cost is minimized. 

Sen et al. [11] study the benefits of caching as a method to minimize the delay 
associated with the playback of a multimedia stream. They advocate caching the 
initial parts of a clip so that the playback can be started earlier. 

More generally, the problem of caching in distributed environments such as 
the Internet is receiving increasing attention. Li et al. [8] present algorithms 
for the placement of caches. Karger et al. [7] and Plaxton and Rajaraman [10] 
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give methods for the placement of data and the assignment of client requests to 
caches. 

2 Single Cache 

We begin by considering the single cache model in which we wish to minimize 
the required bandwidth on the cache-server link. Let Ai = — ti. We call the 

We include the interval between the first request and the most recently expired 
request in our list of intervals. We denote the length of this interval by Aq. 

At any time instant t, let < Ai^ < . . . Ai , and let Nt = max{n\ 
< B}. Note that Nt changes with time as new intervals get added 
and old ones cease to exist. 

We first assume that the cache has an auxiliary buffer of size B, in addition 
to its regular buffer. We present a caching strategy that we call the Greedy- 
Interval-Strategy. It is similar to the caching scheme proposed in [3], and 
can be described as follows: 

(1) Basic Strategy: Assume the current time instant is t. Identify the Nt 
smallest intervals in the current set of requests. Merge adjacent intervals 
in this set to create super-intervals. For a super-interval of length L, connect 
the request that came earliest in time to the server, and allocate L blocks 
of the cache to this super-interval. Store the last L blocks of the connected 
request in this space. Service all other requests in that super-interval from 
the cached blocks. 

(2) Dying streams: When a request i expires, we say that the interval between 
request i — 1 and i has died. This may cause the set of Nt cached intervals 
to change. To maintain this set online, we do the following. After request 
i — 1 expires, if the dying interval is one of the Nt smallest intervals then we 
only cache that part of it which is required by the request i. If the current 
time is t, then we cache only the last ti-\-T — t blocks of the dying interval 
in the cache. Note that if the dying interval is evicted from the cache by a 
newly arriving interval, it will never get re-inserted, as re-insertions occur 
only when an interval dies. 

(3) Vacant Space: At any instant of time, let S be the number of blocks of 
cache that are not used by any interval. We cache the first 5 part the smallest 
uncached interval in this space. We will show below that this can be done 
online. 

(4) New Requests: When a request arrives, it creates a new interval. We re- 
compute Nt and find the Nt new smallest intervals and store these in the 
cache. Note that except the new interval, we do not add any interval to the 
existing set of intervals. Since we have the first B blocks of the clip in the 
auxiliary buffer, it is trivial to add the new interval. 

The following theorem shows that Greedy-Interval-Strategy is a valid 
online strategy. 
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Theorem 1. When an interval is added to the set of cached intervals, it is 
already present in the cache. 

Proof. We prove this by induction on time. Assume the claim is true at time 
t, i.e., the Nt smallest intervals are present in the cache. The only events that 
change the set of cached intervals are listed below. We will show in each case 
that the new set of Nt smallest intervals are already present in the cache. 

1. A new interval arrives. Since we cache the first B blocks of clip on the 
auxiliary buffer, we have this interval in cache when it arrives. 

2. An interval expires when both streams corresponding to its end-points die. 
This may cause the interval iNt+i to get cached. But in this case, by step 
(3) of the algorithm, we would have cached this interval completely in the 
vacant space. 

Note further that once a dying interval is removed from the cache by a newly 
arriving interval, it will never be re-inserted. 



2.1 Competitive Analysis 

The performance metric we use is the maximum bandwidth used by the algo- 
rithm. We compare it with an optimum offline strategy, which we call Opt. Let 
us assume that Opt uses U units of bandwidth on a sequence of requests. We 
can explicitly characterize Opt. It looks at all the X requested blocks at at any 
instant of time t, and caches those X — U blocks for which the previous requests 
were closest in time. Note that these blocks must have been cached at some time 
in the past, so the algorithm is effectively looking into the future to decide which 
blocks to cache. The algorithm is successful if and only if it caches at most B 
blocks at any instant of time. We state the following results without proof. 

Lemma 1. Given any sequence of requests, if it is possible to cache the blocks in 
such a way that the maximum bandwidth required is U , then Opt, as described 
above is successful. 

Lemma 2. If Opt caches the block required by request ij at time t then it must 
also cache the block required by ib for all j' < j . 

We now compare the performance of Greedy-Interval-Strategy with 
the bandwidth-optimal strategy Opt. 

Theorem 2. If Greedy-Interval-Strategy uses bandwidth U at time t, 
then there exists time t' > t at which Opt uses bandwidth U — 1. 

Proof. Suppose that Opt uses bandwidth U — 2 throughout the interval [t, t + 

^iNt + l ) ■ 

There are two cases to consider. 
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— Case 1 All but one of the streams that are not cached by the Greedy- 
Interval-Strategy at time t are still in the system at time t + 

Then, since Opt uses bandwidth U—2 during the time period [t,t + 

one of these streams must be cached by Opt throughout the time period 
[t, t + By Lemma 2, this means that Opt caches Nt + 1 intervals at 

time t. But, if we consider the Nt + 1 smallest intervals at time t, we have 
Ai- > B (by definition of Greedy-Interval-Strategy). This is a 
contradiction. 

— Case 2 Two or more streams that are not cached by the Greedy-Interval- 

Strategy at time t die before time t + However, by the definition 

of the Greedy-Interval-Strategy, the gap between these streams must 
be at least This is a contradiction. 

Note that if a stream is not cached at time t and dies during t + then 

it must be the earliest stream in the system at time t. 

Corollary 1. The Greedy-Interval-Strategy requires at most one more 
unit of bandwidth than Opt. 

2.2 Removing the Auxiliary Buffer 

We now present a scheme Lookahead-Greedy that removes the auxiliary 
buffer of size B. First of all we modify Greedy-Interval-Strategy so that 
we only begin to cache an interval when there is enough free buffer space to 
cache the entire interval. This modification creates one extra connection to the 
server. 

Suppose that a new request arrives at time t. If the new interval (of length 
5, say) needs to be cached, we start caching the stream corresponding to the 
previous request starting from time t. By the above comment there is free buffer 
space equal to 5 at time t. This space is completely filled up only at time t + 5. 
Therefore, at time t + ^, there is unused space of size 5 — 7. We cache as many 
blocks of the beginning of the new stream in this space as possible. Note that 
the cached section of the new stream increases from 0 to | blocks, but then 
decreases to 0, as the free space decreases. 

Suppose that the next interval that must be cached arrives at time t + The 
size of this interval is at most qt. There are three cases depending on the value 
of /r: 

1. If /r < |, we have the beginning /i blocks in the cache, and we can serve the 
new stream directly from the cache, as in Greedy-Interval-Strategy. 

2. If 5 > /i > |, we will have the first S — ^ blocks in the cache. This means 
that we can serve the new stream from the cache for the interval [t + fi,t + S]. 
At time t + S the first interval will be fully cached. 

3. If /i > S, the new stream does not arrive until the first interval is fully cached. 

This implies that Lookahead-Greedy requires at most one more connec- 
tion to the server than the modified version of Greedy-Interval-Strategy. 
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Recall that the modified version of Greedy-Interval-Strategy requires one 
more connection than the original version. We have the following result. 

Theorem 3. If Opt requires maximum bandwidth U on any sequence of re- 
quests, Lookahead-Greedy requires bandwidth at most U -\-3. 



3 Single Cache with Connection Costs 

We will now consider online algorithms in a more general cost model than the 
one used above. If some request is not part of any super-interval then it is not 
necessary for that request to be routed via the cache. 

In this new model any client can either connect to the server directly or can 
connect via the cache. We assume that the connection cost per unit bandwidth 
from the cache to the server is c. (see Figure 2). Furthermore, for each request 
i we pay cost di if we connect the request to the cache. We pay Si if we connect 
the request directly to the server. Without loss of generality we assume that 
di < Si < c-\- di. 



SERVER 




Link cost for 
client i = si 



CLIENTS 



Fig. 2. The cost model for a single server and single cache. Note that 
di < Si < c-\- di. 



We shall compare our online scheme only against schemes which cache com- 
plete intervals from those formed by the active requests, at any instant of time. 
We call such schemes Interval Caching schemes. We do not consider offline al- 
gorithms that are not interval strategies. This does not matter as long as we 
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assume that the caching policy at a cache is oblivious to the algorithm that as- 
signs requests to caches. More precisely, the caching protocol assumes that the 
requests currently assigned to the cache will remain assigned to that cache until 
they die. 

The cost of an interval caching scheme at any time instant is simply the total 
cost of the bandwidth used in the cost model described above. Our objective is 
to derive an online strategy that is competitive (with respect to the maximum 
cost needed by any interval caching scheme) on a given sequence of requests. 

As before, at any time instant, we number the requests 1, 2, . . . , n according 
to the order in which they arrive. A super-interval is a contiguous set of cached 
intervals, each of which is serviced by one connection to the server. Note that 
the cost of connecting one super-interval to the server is c, irrespective of the 
number of requests cached within it. 

If there are n active requests at any instant of time, we can compute the 
optimum interval caching strategy^ for that time instant using dynamic pro- 
gramming. The running time of this dynamic program is 0(ri^ ^ Si). However, 
we can obtain a polynomial time approximation scheme for this problem by 
scaling all the costs. 



3.1 An Online Algorithm 

We now show how to obtain a 8-competitive online algorithm with respect to the 
optimum interval caching strategy for this problem. The algorithm is very simple 
to describe. We periodically compute the optimum, and try to cache the intervals 
corresponding to it. We call this strategy the Greedy-Prefetch strategy. 

Compute Optimum For the given set of requests, we compute the optimum 
solution. This solution may differ in some super-intervals from the previous 
optimum. We remove all super-intervals in the old optimum but not in the 
new one from the cache, and start caching the new super-intervals. 

Greedy Caching For a new super-interval p, let Lp be the total uncached 
length. Note that this length need not be contiguous. Let nip be the point 
on this super-interval where the length of the uncached portion is exactly 
Let the two halves be Xp and Yp. Let Sp be the current total cost of 
connecting the requests in super-interval p to the server. Among Xp and Yp 
we find that half with the larger current connection cost, and start caching 
that half of the super-interval with one connection to the server. 

Prefetch When we compute the optimum and start caching the new super- 
intervals, the time it would take for the solution to stabilize is the size of the 
largest uncached part of any super-interval. Let us denote this amount of 
space as L. In this space, we cache the beginning of the clip with one extra 
connection to the server. We can cache at most [-|j blocks of the clip before 
this space starts to diminish in size. For any new request arriving within time 
[■jJ of computing the optimum, we serve it from the cache. Note that the 

^ This problem is NP-hard, with a reduction from the Knapsack problem [6]. 
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request arriving at time [-j] after computing optimum must be connected 
directly to the server. We can do this at an extra cost of c. 

Recompute Optimum We recompute the optimum after time [ of the pre- 
vious computation, and repeat the whole process. Note that if the optimal 
solution does not cache anything, we recompute the solution at the next 
request arrival. 



3.2 Competitive Analysis 

We will now show that Greedy-Prefetch is competitive with respect to the 
maximum optimum interval caching cost on any sequence of requests. Denote the 
times at which we compute the optimum solution by xi, X 2 , . . .. Let the cost of the 
optimum interval caching strategy at time Xi be M{i), and the cost of Greedy- 
Prefetch be G{i). Also, let S{i) be the total excess cost of connecting requests 
within a new super-interval to the server at time Xi. We state the following 
results without proof. 

Lemma 3. For the functions G, S and M as defined above, the following are 
true: 

1. S'(f-hl) < 3M{i)+[^\. 

2. G{i) < 2M{i) + S{i). 

3. The cost G attains its maximum at one of the times Xi . 

Theorem 4. Given any sequence of requests, if the optimum interval caching 
strategy can service them within a cost of C, Greedy-Prefetch requires cost 
at most 8C to service them. 



4 Multiple Caches 

The most general version of this problem can be stated as follows. We have a 
single server, K caches and n clients in a network. The caches have capacity B 
and the connection cost per unit bandwidth to the server from cache j is Cj . The 
cost for connecting client i to cache j is dij. We first present an offline 0{K)~ 
approximation algorithm^ for finding the optimum set of intervals to cache if we 
are allowed to increase each buffer size by a factor 0{K). We then show how 
to convert the offline scheme into an online scheme if we are allowed a further 
increase in the cache size. 



4.1 Offliue Approximatiou Algorithm 

The server stores a single clip. Client i requests the clip at time ti, and we 
assume that t\ <t 2 < ■ ■ ■ <tn- Let Ai = ti — U-i. The objective is to compute 

® The offline problem is NP-hard, with a reduction from the Generalized Assignment 
Problem [12]. 
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the optimum set of intervals to store in each cache so that the total cost of the 
bandwidth used is minimized. 

The above problem can be formulated as an IP as follows. Let Xij be equal 
to 1 if request i is served via cache j, and 0 otherwise. Let be equal to 1 
if request i is connected to the server via cache j. Let Zij be equal to 1 if the 
interval [ti-i, ti] is present in cache j. Note that request i need not be served by 
this interval^. 

K n 

Minimize EE dijXij 

= ^ Vi e {l,2,...,n} 

yj€{l,2,...,K} 

Zij < Zi-ij + yi-ij \/i,j 

Xij ^ Zij -\- yij Vi , j 

^ij : Uij-> Zij G {0,1} 

Definition 1. A solution to the above IP is a (a, /3)-approximation if the 
cost of the solution is at most a times the cost of the optimal fractional solution 
(obtained by relaxing the IP), and the required buffer space cache j used is at 
most (3 ■ Bj for all j. 

We first show a lower bound on the approximability via this formulation, 
without proof. 

Theorem 5. If the above IP is (a, [3)-approximable by rounding its LP relax- 
ation, then af3 > K . 

We will now show how to obtain a (4iV, 4iV)-approximation by rounding the 
relaxation of the IP described above. For each cache j, look at the variables in 
increasing order of i, applying the following rules. We denote the rounded Xij as 

^ijt Uij ^ij'i Zij as ^ij- 

1. Xij > ^ ^ Xij = 1, else Xij = 0. 

2. Zij ^ 2 i 7 ’ — 4 lc ^ else Zij — 0. 

Vij zl 2 k ’ Zij-\j — 1 and Zij — 0 — 1, else Yij — 0. 

Theorem 6. The rounding scheme described above gives a {AK,AK)~ approxi- 
mation. 

4.2 Online Scheme 

We now show how to convert the offline algorithm into an online scheme. Suppose 
that we increase the buffer size at cache j to 12KB. We compute the approximate 
offline solution every 4,KB time steps. We use the additional buffer space as 
follows. At each cache we store the first 4:KB blocks of the clip. Any new request i 
is served from the cache j with the smallest dij until the optimum is re-computed. 

For any super-interval, the first and last requests will be served by that super- 
interval. However, intermediate requests may not be served. 
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5 Open Problems 

The most interesting question is whether it is possible to improve the approxi- 
mation factor in the multiple cache offline problem. Another interesting question 
is whether it is possible to obtain online schemes in the multiple cache setting 
without blowing up the cache size. Finally, it would be interesting to see if it is 
possible to extend these schemes to the case where the requests are not for the 
complete clip, but for parts of it®. 
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Abstract. Given a class C of Cayley graphs, and given an edge-colored 
graph G of n vertices and m edges, we are interested in the problem of 
checking whether there exists an isomorphism (j> preserving the colors 
such that G is isomorphic by 0 to a graph in C colored by the elements 
of its generating set. In this paper, we give an 0(m log n)-time algorithm 
to check whether G is color-isomorphic to a Cayley graph, improving a 
previous logn) algorithm. In the case where C is the class of the 

Cayley graphs defined on Abelian groups, we give an optimal 0(m)-time 
algorithm. This algorithm can be extended to check color-isomorphism 
with Cayley graphs on Abelian groups of given rank. Finally, we propose 
an optimal 0(m)-time algorithm that tests color-isomorphism between 
two Cayley graphs on i.e., between two circulant graphs. This latter 
algorithm is extended to an optimal 0(n)-time algorithm that tests color- 
isomorphism between two Abelian Cayley graphs of bounded degree. 



1 Introduction 

Checking whether two input graphs with n vertices and m edges are isomorphic is 
known as the graph-isomorphism problem and is a famous combinatoric problem. 
For general graphs the graph-isomorphism problem is known to be in NP, not 
known to be in P, and probably is not NP-complete (see Section 6 of [3]). A 
relaxed problem is to recognize classes of graphs, that is to check whether a 
given graph G is isomorphic to an element of a specified class of graphs (e.g., 
interval graphs, chordal graphs, planar graphs, etc.). Depending on the class of 
graphs, this problem can be easy (e.g., checking planarity). However, for many 
classes, checking whether a graph belongs to that class remains difficult. This is 
particularly true for Cayley graphs [12]. 
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A graph G = (V,E) is Cayley if there exists a group F and a generating set 
S of r such that V is the set of the elements of F, and there is an arc from x to 
y in G, that is {x, y) G E, if and only if there exists s € S such that y = s * x, 
or, in other words, if and only if y * x~^ G S. (In the following, we will use the 
additive notation for the operation * of A). If G is Cayley, then G is denoted by 
the pair (F,S). If S is involutive, that is if (s € 5” s~^ G S) holds, then G 

is a symmetrically-oriented directed graph which can be seen as an undirected 
graph. The regularity and the underlying algebraic structure of Cayley graphs 
make them good candidates for interconnecting nodes of a network [11]. For 
instance. Cycles, Tori, Hypercubes, Cube-Connected Cycles (CCC), Butterflies 
with wraparound. Star-graphs, Bubble Sort graphs. Pancake graphs. Chordal 
rings (i.e., circulant graphs) are all Cayley graphs. Some of them are defined on 
Abelian groups (e.g., hypercubes and tori), whereas some others are defined on 
non- Abelian groups (e.g.. Star-graphs, and Pancake graphs). 

The complexity of graph-isomorphism for Cayley graphs is a challenging 
problem [12,14]. Currently, the problem is still open even if we restrict our at- 
tention to circulant graphs [2,7], i.e., Cayley graphs defined on Z„. The problem 
is known to be polynomial only for specific classes of circulant graphs, e.g., those 
with a prime number of vertices [6,15]. In this paper, we study a variant of the 
graph-isomorphism problem. Given two (properly or not) arc-colored digraphs 
Gi = (Vi,^!) and G 2 = (V 2 ,if 2 ), these two digraphs are said color- is amorphic 
if there exists an isomorphism from Gi to G 2 that preserves the colors. More 
formally, let Gi be the set of colors of Gi, and G 2 be the set of colors of G 2 . 
Gi and G 2 are color-isomorphic if there exist a one-to-one mapping </> : Vi — > V 2 
and a one-to-one mapping ^ : Gi — > G 2 such that for every c G Gi, and for every 
(x, 2 /) G Vi X Vi: 

{x, y) G El and is of color c {(p{x) , cj){y)) G E2 and is of color tp{c) . 

Graph-isomorphism is the particular case of the color-isomorphism problem ob- 
tained by choosing Gi = G 2 = {c}. Conversely, one can construct a polyno- 
mial reduction from color-isomorphism to graph- isomorphism^. Therefore, the 
two problems are equivalent. That is why we will focus on specific colorings of 
graphs to solve the color-isomorphism problem. In particular, the generating 
set of a Cayley graph {F, S) induces a coloring of the edges, simply by col- 
oring the edge (x, s -I- x) by the element s G S. Such a coloring is called the 
natural edge-coloring of a Cayley graph. As we show in this paper, the color- 
isomorphism problem between naturally colored Cayley graphs is apparently 
simpler than graph-isomorphism, or color-isomorphism, between arbitrary col- 
ored Cayley graphs. 

The notion of color-isomorphism applied to naturally colored Cayley graphs 
was introduced to characterize edge-labeled regular graphs that have so-called 
minimal sense of direction [8]. In [9] it is shown that such regular graphs are 

^ In fact, we will show in the full version of the paper that there is a polynomial 
reduction from color-isomorphism of Cayley graphs to Cayley isomorphism. 



78 



Lali Barriere et al. 



exactly those that are color-isomorphic to a Cayley graph whose edges are col- 
ored by the elements of its generating set. Boldi and Vigna proposed in [5] 
an log n)-time algorithm to check whether a colored graph is color- 

isomorphic to a naturally colored Cayley graph. A parallel version of this algo- 
rithm runs in O(logn) time with n® processors. 

Note that there are major differences between, on one hand, the problem of 
checking whether a given graph is isomorphic (or color-isomorphic) to a Cayley 
graph of a specified class, and, on the other hand, the problem of checking 
whether a given group is isomorphic to a group of a specified class [13]. For 
instance, an 0(n)-time algorithm exists for Abelian group isomorphism [16], but 
the addition table of the group is given as input of the problem. In the color- 
isomorphism problem, the algebraic structure of the graph is hidden, and it may 
not be possible to perform elementary group operations in constant time. In this 
paper we show how to extract the structure of the underlying group of colored 
Cayley graphs. This is done by a pure graph and algorithmic approach. 

Results and structure of the paper 

1) In Section 3, we present an 0(mmin{fc,logn}) = 0(m log n)-time al- 

gorithm that checks whether a colored graph of degree k given as input is 
color-isomorphic to a naturally colored Cayley graph, improving a previous 
q(j.j 4.752 algorithm. 

2) In Section 4, we give an 0(m)-time algorithm that checks whether a col- 

ored graph given as input is color-isomorphic to a naturally colored Cayley graph 
defined on an Abelian group. This algorithm trivially extends to an algorithm 
that, given any integer r and any graph G, checks whether G is color-isomorphic 
to a naturally colored Cayley graph on an Abelian group of rank r. As a con- 
sequence, we can check in 0(m)-time whether a colored graph given as input is 
color-isomorphic to a naturally colored circulant graph (r = I). The decrease of 
the complexity from 0(mlogn) for arbitrary Cayley graphs to 0(m) for Abelian 
Cayley graphs is obtained by a sophisticated technique used to perform arith- 
metic computations in x • • • x Y\a=i with the standard 0(log n)- 

word RAM model [10]. 

3) Finally, in Section 5, we give an 0(m)-time algorithm that checks color- 
isomorphism between two naturally colored circulant graphs given as input. This 
algorithm can be generalized to produce an 0(n)-time algorithm that checks 
color-isomorphism between two bounded degree naturally colored Abelian Cay- 
ley graphs. 

Note that our 0(m)-time algorithms are optimal since, in the worst-case, we 
have to check the m edges of the input graph. As far as the sense of direction 
theory is concerned, our results have the following consequence: 

Corollary 1. There is an O {m min{k^ log n}) -time algorithm which, given any 
regular graph and any edge-coloring of this graph, returns whether this coloring 
gives G a minimal sense of direction. 

The next section gives preliminary results that are helpful for solving our 
problems. 
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2 Notation and Preliminary Results 

The model. In each of our problems, we assume that the input graph is a digraph 
(that is a strongly connected directed graph). Again, a graph is a particular case 
of this setting by viewing the graph as a symmetrically-oriented directed graph. 
The input arc-colored digraph of order n is described by its adjacency list C, 
including the colors. The vertices are labeled from 1 to n, and, for every vertex 
X, L[x] is the list ((j/i, ci), . . . , (j/fc, Cfe)) of the k out-neighbors of x together with 
the colors of the arcs linking x to these out-neighbors. Each color Ci is an integer 
taken in the set of colors C = {1, . . . , fc}. Actually, for /c-regular digraphs (that is 
where each vertex is of in- and out-degree fc), the list can be replaced in O(nfc) 
time by a n X fc adjacency table T so that, for every vertex x and for every 
i € {1, . . . ,fc}, T[x,Ci] returns in constant time the t-th neighbor yi of x such 
that the arc (x, yi) is colored c^. Therefore, for a fc-regular digraph of n vertices, 
the size of the input is nk integers taken from the set {1, . . . , n}. 

Note that one can check in 0{nk) time whether a digraph is fc-regular, and 
whether a digraph is properly colored by fc colors (that is every vertex has its 
fc incoming arcs colored with fc different colors, and its fc outgoing arcs colored 
with the same fc different colors). Therefore, unless specified otherwise, we will 
always assume in the following that the input digraph is fc-regular, and properly 
colored with fc colors. 

Notation. Let xq be an arbitrary vertex of a properly colored digraph G = {V, E) . 
Since we are checking whether G is isomorphic to a subclass of Cayley digraphs, 
one can assume that G is vertex-transitive. Indeed, if this assumption happens to 
be false, it will create a contradiction further at some stage of the algorithm, and 
therefore it will be detected. Thus, we can assume, w.l.o.g., that xq is the zero 
of the underlying group. As stated previously, the elementary group operations 
are not directly available to be performed in constant time given the digraph 
G only. Indeed, the (supposed) algebraic structure of the digraph is hidden in 
its adjacency table. So, let us redefine standard group operations applied to 
digraphs. 

The concatenation of two words w and w' whose letters are taken in the set 
of colors C, that of two words w and w' of C*, is denoted by w0w'. Let S be any 
directed tree spanning G, and rooted at xq . We can label each vertex x of G by a 
sequence of colors according to the color of the arcs encountered in the path from 
Xq to X in S. More precisely Xq is labeled i{xo) = e (the empty sequence), and 
every vertex x Xq is labeled £{x) = i{y) ® c, i.e., the concatenation of the label 
of the father y of x in S, and of the color c of the arc (y, x). Once this labeling 
is fixed, one can define the orbit of a vertex x Xq as the ordered sequence of 
vertices yo,yi,y 2 , . ■ . where yo = xq, and, for i > 0, j/i is the vertex reached from 
yi-i by following the sequence of colored arcs given by the sequence of colors 
of i{x). In particular yi = x. G being properly colored, one can then define the 
order of a vertex x y^ xq as the smallest integer t > 0 for which yt = xq. The 
order of x is denoted by o(x), and o(xq) is set to 1. 
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By extension, the order of a color c, denoted by o(c), is defined as the order 
of the out-neighbor Xc of xq such that the arc (xq,Xc) is colored c. The orbit of 
c is the orbit of Xc- Note that the orbit of a color is an elementary cycle in G, 
whereas the orbit of an arbitrary vertex x may not be a cycle, in particular if x is 
not an out-neighbor of xq- Finally, we define the addition of vertices. Given two 
vertices x and y, x + y is defined as the vertex reached from x by following the 
path £{y). The orbit of a vertex x passing through y is simply the sequence 
y,y + x,y + 2x,... 

Correspondence tables. Computing the order of a color c can be done in O(o(c)) 
time by simply following from xq the path composed of arcs of color c. Adding a 
color c to any vertex y takes 0(1) time using the adjacency table T, because the 
vertex y -|- c is given by T[y, c]. However, computing the order of an arbitrary 
vertex x looks harder since following the path of color £(x) from a vertex y 
requires \£{x)\ accesses to the adjacency table, where \£{x)\ denotes the length 
of sequence £{x). Nevertheless, this complexity can be reduced if we restrict our 
study to the case £{x) = c®- ■ - ®c, with \£{x) \ = a. One can compute the vertex 
y -|- ac in 0{a) time using T, but one can do better as shown in the following 
lemma: 

Lemma 1. There is an 0{nk)-time and 0{nk)-space algorithm that, given a 
properly colored k-regular digraph G of order n described by its adjacency table 
T , either returns false, and then G is not a Cayley digraph, or returns a data 
structure such that, if G is color-isomorphic to a naturally colored Cayley digraph 
(r,S), then it enables computation of vertex x -\- as in constant time, for any 
X G r, any integer a > 1 and any s & S . 

Proof. Let us first assume that G is a naturally colored Cayley digraph (T, S) 
of order n given by its adjacency table T. First, let us describe the structure, 
and show how to compute z = y-\-ac in constant time using that structure, that 
is a vertex reached from y by following the path £(ac) = c ® ■ ■ ■ ® c composed 
of a times the arc colored by c. Then we will show how this structure can be 
constructed in 0{nk) time. For every c G C, that is for every color of the input 
digraph, G can be viewed as Uc = n/o(c) c-colored orbits of length o(c) with 
arcs of the other colors between the orbits. Let us concentrate on the Uc orbits 
of c. Arbitrarily order these orbits from 1 to ric, and, inside each orbit, order 
the vertices of this orbit from 0 to o(c) — 1 by starting at any vertex, and by 
respecting the natural order induced by the orbit. We get an (n x fc)-table T, 
and k tables Tc, one for each color c S G, of respective size Uc x o(c). Tc[i,j] is 
the vertex x at the j-th position in the i-th orbit of c. Conversely, T[x, c] returns 
the pair (i, j). Computing z = y -\- ac can be done in constant time using the 
table T and the tables Tc, c € G, as follows: 

— T[y, c] specifies that y is the j-th element of the i-th orbit of c; 

— z = Tc[i, (j -I- a) mod o(c)]. 

The construction of T and Tc,c G G, can be done in 0{nk) time as follows. 
There is one phase for each color c G C. Once c is fixed, we visit all the vertices 
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and construct the Uc orbits of c (the visit of the first orbit allows to set o(c)). 
This is done in ric steps, one step for each orbit. At each step, say at step i, an 
arbitrary not yet visited vertex x is chosen. T[x, c] is set to (f, 0), and Tc[i., 0] is 
set to X. Visiting the o(c) — 1 remaining vertices x\,. . . ,Xo(c)-i of the current 
orbit allows to set T[xj,c\ to (i, j), and Tc[i,j] to Xj, for all j, 1 < j < o(c). The 
total cost of this construction is 0 {nk) because there are k = \C\ phases, and, at 
each phase, every vertex is visited exactly once because G is a Cayley digraph. 

Given an arbitrary properly colored fc-regular digraph G, one can apply the 
previous construction. This construction may fail if G does not have enough 
symmetry, and, conversely, it may succeed even if G is not a Cayley digraph. 
Nevertheless, for any digraph such that, for every color c, the set of vertices 
can be decomposed in ric c-colored orbits of the same length, the construction 
will succeed. Since G is properly colored, the only thing that must be checked is 
whether all orbits of the same color have the same length. The algorithm returns 
FALSE if it fails. This checking can be done on the fly, during the construction 
of the tables, with no additional cost. □ 

Definition The tables resulting of the proof of Lemma 1 are called the corre- 
spondence tables of the colored digraph G. 

3 Recognizing Cayley Colored Digraphs 

Recall that G = {1 , . . . ,k} is the set of colors. For any vertex x, we define the 
equivalence relation =x on words in C* by wi =x W2 if and only if x-\-w\ = x-\-W2- 
Then we define the equivalence relation ~ on vertices by Xi ~ X2 if and only if 
=3,1 and =3,2 are identical. 

Lemma 2 . A k-regular connected properly colored digraph G = (V, E) is a Cay- 
ley digraph if and only if V\ ~ V2 holds for every pair (^1,^2) of vertices ofV. 

Proof. The “only if” part is trivial: if the graph is Cayley, then for every vertex 
V, the relation =y is exactly the relation under which words are equivalent if and 
only if their values in the group are equal. We write = for the relation =y of every 
vertex v G V. Choose a vertex xq € V, choose a spanning tree rooted at xq, and 
define £{x) for every vertex x gV as in Section 2. Recall the definition of x\-\-X2 
as x\-\-£{x2) in Section 2. It is easy to see that xo is an identity for this + and that 
if w is the label of a path from x\ to xq , then xq + m is an inverse of xi . We have 
that £{xi)^i{x2) = £{xi-\-X2) since both are the colors of paths from xq to xi-\-X2- 
Now the operation + is associative since £{xi-\-{x2-\-xf)) = £{xi)®{£{x2)®£{xf)) 
and £{{xi + X2) + X3) = {£{xi) £{x2)) 0 £{x3). □ 

Lemma 3 . If xi ^ X2, then x\ -\- c ^ X2 -\- c. 

Proof. Suppose that xi ~ X2 . If wi =xi-i-c W2 , then {x\ + c) + = {x\ + c) + UI2 

so + (c 0 wi) = + (c 0 ^2)- In other words, c® wi =3,^ c® W2 and so 

c®w-i =3,2 c®vu2 giving X2-\- {c®wi) = X2-\- {c®W2) and Anally (x2 + c)+tui = 
(X2 -\- c) -\- W2 so that Wi =x2+c W2. □ 
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The following 0(m)-time algorithm tests for two vertices x and y whether 
X ~ y: 

Testing equivalence algorithm 

1 . Choose a spanning tree rooted at x and label the vertices x = Xq , xi , . . . , x„_ i 
according to a pre-order traversal. Traverse the similar tree rooted at y (i.e., 
using the same colors as in the tree rooted at x), checking that no vertex 
is encountered twice and labeling the vertices y = yo,yi^ , j/„_i again in 
pre-order. 

2. For every i < n and every color c € C, consider Xi -I- c = xj and yi + c= yj>] 
if in any case j yf / then return FALSE, else return true. 



Proof. Let Wi be the word in C* labeling the path in the spanning tree from x to 
Xi. If a vertex is encountered twice in traversing the tree rooted at y, say in the 
positions corresponding to Xi and Xj then we have that Wi =y wj but Wi Wj\ 
if we ever found Xi + c = Xj, yi + c = yj> with j j', then Wi ® c =x wj but 
Wi® c Wj. On the other hand, if we always have j = j', we have a way of 
reducing any word to a unique equivalent Wi which is valid for both and 

= y. □ 

Given Lemma 3, we will know that all vertices are equivalent to xq if we find a 
set X = {ci, ..., Ct} of colors such that xq ~ xq -I- Ci {1 < i <f) and every vertex 
is reachable from xq by a path labeled by colors from X. If the graph is Cayley, 
each new color added to X takes us from a subgroup to a larger subgroup and so 
at least doubles the number of reachable vertices. Hence at most min{fc,logn} 
colors can be added before either exhausting V or finding that the graph is not 
Cayley. 

Theorem 1. There is an O {rn min{k, log n}) -time algorithm which, given a 
properly colored k-regular digraph G, checks whether G is color-isomorphic to a 
naturally colored Cayley digraph. 

Proof. Starting with a set S of vertices including only xq, and a set of colors X 
initially empty, we consider each color c G C in turn. If xq -I- c is already in S 
we do nothing. Otherwise we test that xq -I- c ~ xq, rejecting if not, add c to X 
and then add to S all vertices reachable from existing elements of S' by a path 
starting with color c and then using any colors from X. If this exhausts V accept 
and otherwise if the set of colors is exhausted reject. 

Given that we stop the recursive process of adding vertices to S when we 
encounter a vertex already in S, any edge is only considered once and so the 
time is dominated by the at most min{fc,logn} tests for xq -I- c ~ xq each of 
which takes time 0(m) giving the total time bound of 0(?n min{fc, logn}). 

Note that we do not need to test that |S| has at least doubled after each 
incremental operation because the subgraph consisting of the vertices in S and 
the edges between them colored by colors in X is always a Cayley graph. □ 
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We were not able to prove or disprove the optimality of our algorithm. It is 
obviously optimal for bounded degree graphs, but there is a 0(log n) gap between 
its worst-case complexity and the trivial lower bound f2(m) for arbitrary graphs. 
In the next section, we give an optimal algorithm for the class of Abelian Cayley 
graphs. 



4 Recognizing Colored Cayley Digraphs on Abelian 
Groups 

Recall that the class of Abelian Cayley graphs contains networks as popular as 
the hypercubes, the tori, and the circulant graphs. Let n — 0^=1^™* t>e the 
prime-decomposition of n. An Abelian group F of order n is determined by a 
specific decomposition of the mi’s in mi = > ■ • ■ > cti,ri > 1- We 

then get (see for instance [4]): 

r (Z °i,i X • • • X Z oi.n ) X (Z c« 2 ,i X • • • X Z 02 , r, ) x • • • x (Z <»t,i x • • • x Z ) 

^ Pi Pi ^ ^ ^ P 2 P 2 ^ Pt Pt ' 

(1) 

By ordering the oizj’s so that aij > aij+i, and the pi^s so that pi > let 

t 

nj = j = 

i=l 

where r = maxiPj, and aij = 0 for j > ri. The rank of F is then equal to r. 
Note that, for any Cayley digraph G = {F, S) to be connected, we necessarily 
have r < k = l^"!. From the definition of the Uj’s, one can express F in the form 
Z„j X • • • X Z„^, Hi > rii_|_i. (Note that rii+i divides rii.) This latter form is called 
the standard form of F. 

Our recognition algorithm performs in three phases. The first finds a basis 
of the unknown Z-module F corresponding to the input digraph G. Once this 
basis has been determined, that is once the group F has been identified, we try, 
during the second phase, to label the vertices, and to compute a candidate for S 
such that G = (T, S). After that, a one-to-one mapping between the vertex-sets 
of the two graphs has been set. Therefore, the third phase checks whether or 
not the colors of the edges of G and {F, S) correspond by a one-to-one mapping. 
We actually do not label directly the vertices of G by r-tuples because it is not 
computationally cheap to find elements of order ni, . . . , n^. Rather, we still first 
make use of F in its original form depicted by Eq. (1). This will yield labels of 
coordinates. We will refine this construction later, to finally get labels 
on r coordinates. 

Notation. We will denote by (xi, . . . , Xt) the orbit of xi, . . . ,Xt, that is the set 
of the linear combinations of x\, . . . ,Xt, no matter if the Xi’s are vertices of a 
digraph, or elements of a group. 

In the statement of the following lemma, xq refers to the root of the tree S 
defined in Section 2. 
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Lemma 4. There is an 0{k^/n + n)-time algorithm which, given a properly 
colored k-regular digraph G and its correspondence tables, returns 

1. a t-tuple (ri, . . . , rt); 

2. t ri-tuples , cti^n), i G {1, . . . , t}; and 

3. Y!'i=iTi vertices gij, i G {1, . . . ,t}, j G rj; 

such that, if G is color-isomorphic to a naturally colored Abelian Cayley digraph 
on r, then 

- o{gij) = for every i and j; 

- idi.j) (ffi.i) • ■ ■ = { 2 : 0 } for every i, and every j > 1; 

~ {gi,i,. . .,gi,n) n {gi>p, . . .,gi>,r,,) = {a:o} for every i ^ i' ; and 

- L ~ (Z c«l,l X • • • X Z c«l,ri ) X • • • X (Z <»t,l X • • • X Z ot.rt ). 

'■ Pi Pi ^ ^ '■Pt Pt ' 

The proof of Lemma 4 will be given in the full version of the paper. 

Theorem 2. There is an 0{m)-time and 0{m) -space algorithm which, given a 
colored digraph G of m arcs, checks whether G is color-isomorphic to a naturally 
colored Cayley digraph on an Abelian group. In the affirmative, the algorithm 
returns the corresponding Cayley digraph (T,S), T given in the standard form 
r = Z„j X • • • xZ„^, and the two one-to-one mappings 4> '.V ^ T and ip \ C ^ S . 

Proof. As said before, we first check in Opm) time that G is properly colored 
and regular. If it succeeds we apply Lemma 1 and Lemma 4, otherwise return 
FALSE and stop. To label the vertices, we will not directly use the gtjs derived 
in Lemma 4, but we will group some of them together to get a system of r 
coordinates rather than Vi coordinates. Let 

t 

hj = '^9i,j, j = 1, ■ • ■ 

i=l 

where gij = 0 if j > r^. We get o(hj) = Uj for every j, and we label the vertices 
in T = Z„j X • • • X Z„^ as follows. 

Vertex Xq is labeled by the r-tuple (0, 0, . . . , 0). Vertices sfs are simply rep- 
resented by the neighbors of xq. The /3i-th vertex of the orbit of h\ is labeled 
the r-tuple (/3i,0, ... ,0). The /32-th vertex of the orbit of /12 passing through 
(/3i, 0, . . . , 0) (starting from (/3i, 0, . . . , 0)) is labeled (/3i, /32, 0, . . . , 0). And so 
on. Thanks to Lemma 1, each jump along the orbit of hi (starting at xq) 
can be computed in 0(min{fc,f}) time since hi is the sum of multiples of at 
most min{fc,t} colors. Similarly, each jump along the orbit of /li+i, starting at 
(/3i, /32, . . . , /3i, 0, . . . , 0) takes a time 0(min{fc, it}) because ht+i is the sum of 
multiples of at most 0(min{fc, *t}) colors. Therefore, the whole labeling can be 
performed in 0(nm.in{k,rt}) time, that is in at most 0(m) time. Meanwhile, 
the sequence (ni, . . . , Ur) has been computed in at most 0{m) time. 

Once the labeling has been performed, it remains to check that, for every 
i = 1, . . . ,k, all arcs 

(x,y) = {{xi,...,xr),{yi,...,yr)) 



On Recognizing Cayley Graphs 



85 



colored ct satisfy 



1 



?/i - = V'i(ci) (mod ni); 

V 2 - X 2 = ip 2 {ci) (mod 712 ); 



Ur — Xr = tprici) (mod n^); 



( 2 ) 



where ip{c) = . . . ,%pri,c)) is a r-tuple of Z*" depending only on c G 

{ci, . . . ,Cfc}. If these equalities are satisfied by every color a, then G is color- 
isomorphic to (r,S) where S = {ip{ci ), . . . , V^(cfe)}. Otherwise, G is not color- 
isomorphic to a naturally colored Cayley digraph on an Abelian group (and 
we return false and stop). Indeed, the map <() : C — > C such that 4>{aihi + 
. . . -|- arhr) = (ai,...,ai.) is a group-automorphism, and therefore, if G = 
(C, {si, . . . , Sfc}), then ip{ci) is actually the labeling of s^, that is ipici) = 4>{si) = 
where are the k neighbors of xq such that, for every i G 

fc}, = Si -I- xo- A naive component-wise approach to check Eq. (2) 
yields an 0(nkr)-time algorithm with a 0(nfc)-space. As detailed in the full 
version of the paper Eq. (2) can be checked in 0(1) time for each arc by using 
pre-compiited 0(nfc)-space tables requiring 0{nk) time to be set. This can be 
done with the standard 0(logn)-word RAM model [10]. □ 

Recognizing colored Abelian Cayley digraphs of a given rank r can be done by 
a simple variant of the algorithm of Theorem 2. Indeed, once the gij^s have been 
computed in 0{m) time (see Lemma 4), the rank of the group F such that G 
is possibly color-isomorphic to a Cayley digraph on F is determined. Therefore, 
one can stop the algorithm if this rank is different from r. The algorithm will 
perform in 0{m) time as shown in Theorem 2. As a consequence: 

Corollary 2. For any integer r > 0, there is an 0(m)-time and 0{m)-spaee 
algorithm whieh, given a colored digraph G, checks whether G is color-isomorphic 
to a naturally colored Cayley digraph on an Abelian group of rank r. In the 
affirmative, the algorithm returns 

— the corresponding Cayley digraph (T, S) where F given in its standard form; 

— the labeling function </> : E — > T; and 

— the coloring function if \ C ^ S. 



5 Color-Isomorphism for Circulant and Abelian Cayley 
Digraphs 

Circulant graphs have attracted a lot of attention in the literature from both a 
theoretical and practical point of view. Indeed, the algebraic structure of circu- 
lant graphs is simple but rich, and their regularity make them suitable candidates 
for connecting nodes of a network. Let us show how one can check whether two 
given naturally colored circulant graphs are color-isomorphic. The elements of 
the generating set S' of a circulant graph (Z„, S) are called chords. We say that 
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two subsets T and S of Z„ are proportional if there is an integer A such that 
gcd(A,n) = 1 and T = XS. Two circulant digraphs are then said to be Addm- 
isomorphic if the sets of chords are proportional [1]. One can easily show that 
Adam-isomorphism and color-isomorphism between naturally colored circulant 
digraphs are two equivalent notions, that is: 

Lemma 5. Two naturally colored circulant digraphs are color-isomorphic if and 
only if they are A dam-isomorphic. 

Therefore, we get: 

Theorem 3. There is an 0{m)-time algorithm that determines, for any pair 
G and H of naturally colored circulant digraphs, whether G is color- isomorphic 
to H . 

Proof. Let G and H be the two input digraphs, both of order n. Let S', T C Z„ 
be the sets of chords returned by the algorithm of Corollary 2 (with r = 1) 
on G and H respectively. From Lemma 5, G and H are color-isomorphic if 
and only if there exists an invertible constant A in Z„ such that T = AS. Let 
S = {si, . . . , Sk}, and T = {ti, . . . , tk}. There are at most k candidates for A: 
the solutions of U = Asi, i = 1, . . . , fc, if exists. Each candidate can be computed 
in 0(log n) time by Euclid’s algorithm. For every candidate, one checks whether 
T = AS (that can be performed in 0{k) time assuming a pre-computed n bitmap 
encoding the set T). All these operations take 0{n -\- k{logn -\- k)) = 0{n -\- 
time, which is bounded by the time required by the algorithm of Corollary 2. □ 

The previous result can be generalized to Abelian Cayley digraphs by noticing 
that, as a generalization of Lemma 5, two naturally colored Abelian Cayley 
digraphs (T, S) and (T, T) are color-isomorphic if and only if there exists an 
automorphism (j) oi T such that T = 0(S), and for any basis B of the Z-module 
r, 4>{B) is also a basis. Therefore, given two naturally colored Abelian Cayley 
digraphs G\ and G 2 , one can apply Theorem 2 to compute their respective 
groups Ti and 1^2, and their respective generating sets S and T. If Ti A ^ 2 , then 
Gi and G 2 are not color-isomorphic. (This latter checking can be done easily 
be comparing the two sequences n\, . . . ,Ur of Ti and l 2 since our algorithm 
returns the group in its standard form.) Otherwise, one can test all the one-to-one 
functions mapping S to T. Computing one generating subset S' C S oi T (which 
can be done in 0{m) time), one can show that there are only kl/{k— |5"|!) < fc’' 
mappings to test. Choosing a suitable base B, one can compute 4>{B) and check 
whether 4>{B) is a basis in 0{m) time. In total, it gives rise to an 0(fc’’m)-time 
algorithm. 

Corollary 3. There is an 0{kTm)-time algorithm that determines, for any pair 
G, H of naturally colored Cayley digraphs on Abelian groups of rank r, whether 
G is color-isomorphic to H . 

Our algorithm is linear for bounded degree and polynomial for bounded rank. 
We leave as an open problem the following: 
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Problem 1. Design a polynomial-time algorithm that checks color-isomorphism 
between two naturally colored Cayley graphs on Abelian groups of arbitrary 
rank. 
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Abstract. We give algorithms for the directed minimum odd or even cut problem 
and certain generalizations. Our algorithms improve on the previous best ones of 
Goemans and Ramakrishnan by a factor of 0{n) (here n is the size of the ground 
vertex set). Our improvements apply among others to the minimum directed T- 
odd or T-even cut and to the directed minimum Steiner cut problems. The (slightly 
more general) result of Goemans and Ramakrishnan shows that a collection of 
minimal minimizers of a submodular function (i.e. minimum cuts) contains the 
odd minimizers. In contrast our algorithm selects an n-times smaller class of 
not necessarily minimal minimizers and out of these sets we construct the odd 
minimizer. If M (n, m) denotes the time of a u-v minimum cut computation 
in a directed graph with n vertices and m edges, then we may find a directed 
minimum 

- odd or T-odd cut with V (or T) even in 0{r?m + n ■ M{n, m)) time; 

- even or T-even cut in 0{n^m + ■ M (n, m)) time. 

The key of our construction is a so-called parity uncrossing step that, given an 
arbitrary set system with odd intersection, finds an odd set with value not more 
than the maximum of the initial system. 



1 Introduction 

The notion of minimum cuts in graphs and its generalization, the minimum of a sub- 
modular function, plays a key role in combinatorial optimization. See [ 15] for a survey 
of the application of minimum cuts and [ 1 3] for that of submodular functions. 

In this paper we consider minimum cuts or minimizers of submodular functions by 
restricting minimization to either the even or the odd sets. More generally we will min- 
imize in so-called triple families ([9] and [6]), set systems that satisfy certain properties 
of even or odd (for the definition see Section 1.3). The fact that the minimum odd cut or 
T-odd cut (requiring odd intersection with a prescribed set T) problems can be solved 
in polynomial time by maximum flow computations is first shown in [14]; the same for 
even cuts as well as its generalizations for matroids is shown in [2]. 
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